Sun Cluster 3.0 5/02 Error Messages Guide

Error Message List

The following list is ordered by the message ID.

900102 :Failed to retrieve the resource type property %s: %s.

Description:

An API operation has failed while retrieving the resource type property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem reoccurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components.

900206 :parameter '%s%s' must be an integer "%s". Using default value of %d.

Description:

Using a default value for a parameter.

Solution:

None.
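
The fallback that message 900206 reports can be pictured with a short sketch. This is not the Sun Cluster source; the function name parse_int_param and its arguments are hypothetical and only illustrate "use the default when the configured value is not an integer".

```python
def parse_int_param(raw, default):
    """Return raw parsed as an integer, or default if it is not an integer.

    Hypothetical helper for illustration; not part of Sun Cluster.
    """
    try:
        return int(raw.strip())
    except (ValueError, AttributeError):
        # ValueError: text such as "abc" or "1.5"; AttributeError: raw is None.
        return default


# A valid value is used as given; an invalid one falls back to the default.
print(parse_int_param("30", 5))
print(parse_int_param("thirty", 5))
```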

900675 :cluster volume manager shared access mode enabled

Description:

Message indicating shared access availability of the volume manager.

Solution:

None.

900843:Retrying to retrieve the cluster information.

Description:

An update to the cluster configuration occurred while cluster properties were being retrieved.

Solution:

Ignore the message.

900954 :fatal: Unable to open CCR

Description:

The rgmd was unable to open the cluster configuration repository (CCR). The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

901030 :clconf: Data length is more than max supported length in clconf_file_io

Description:

While reading configuration data through the CCR FILE interface, the data length was found to exceed the maximum supported length.

Solution:

Check the CCR configuration information.

901066:Monitor_retry_count is not set.

Description:

The resource property Monitor_retry_count is not set. This property controls the number of restart attempts of the fault monitor.

Solution:

Ensure that this property is set. Use the scrgadm(1M) command to set this property.

902721 :Switching over resource group using scha_control GIVEOVER

Description:

The fault monitor has detected problems in the RDBMS server and has determined that the RDBMS server cannot be restarted on this node. An attempt will be made to switch over the resource to another node, if a healthy node is available.

Solution:

Check the cause of RDBMS failure.

903007 :txmit_common: udp is null!

Description:

Cannot transmit a message and communicate with udlmctl because the address to send to is null.

Solution:

None.

903200 :PNM: nafo%d: could not start monitoring

Description:

An internal resource problem has stopped PNM from performing monitoring of the named NAFO group. Fault detection and failover for adapters in the group are therefore disabled.

Solution:

Send a KILL (9) signal to the pnmd daemon. Because pnmd is under PMF control, it will be restarted automatically. If the problem persists, restart the node with scswitch -S and shutdown(1M).

903317 :Entry at position %d in property %s was invalid.

Description:

An invalid entry was found in the named property. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Make sure the property has a valid value.
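
Because the position in message 903317 counts from 0, position 1 refers to the second element of the property list. A minimal sketch of that convention follows; the helper first_invalid_position and the validity check passed to it are hypothetical, not Sun Cluster code.

```python
def first_invalid_position(entries, is_valid):
    """Return the zero-based position of the first invalid entry, or None."""
    for position, entry in enumerate(entries):
        if not is_valid(entry):
            return position
    return None


# Hypothetical property list in which the second element is empty.
entries = ["host-a", "", "host-c"]
print(first_invalid_position(entries, bool))
```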

903734 :Failed to create lock directory %s: %s.

Description:

This network resource failed to create a directory in which to store lock files. These lock files are needed to serialize the running of the same callback method on the same adapter for multiple resources.

Solution:

This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action. For specific error information, check the syslog message.

905023 :clexecd: dup2 of stderr returned with errno %d while exec'ing (%s). Exiting.

Description:

The clexecd program has encountered a failed dup2(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

905591 :Could not determine volume configuration daemon mode

Description:

Could not get information about the volume manager daemon mode.

Solution:

Check whether the volume manager has started correctly.

905720 :Failed to get NAFO status for group %s (request failed with %d).

Description:

A query to get the state of a NAFO group failed. This may cause a method failure to occur.

Solution:

Make sure the network monitoring daemon (pnmd) is running. Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

908240 :in libsecurity realloc failed

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start, or a client was not able to make an rpc connection to the server, probably due to low memory. An error message is output to syslog.

Solution:

Investigate if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

908387 :ucm_callback for step %d generated exception %d

Description:

ucmm callback for a step failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

908591 :Failed to stop fault monitor.

Description:

An attempt was made to stop the fault monitor and it failed. There may be prior messages in syslog indicating specific problems.

Solution:

If there are prior messages in syslog indicating specific problems, correct them. If that does not resolve the issue, try the following. Use the process monitor facility, pmfadm(1M), with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the fault monitor of this resource. The tag is easy to identify because it ends in the string ".mon" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the fault monitor. This problem may occur when the cluster is under load and Sun Cluster cannot stop the fault monitor within the specified timeout period. Consider increasing the Monitor_Stop_timeout property. If the error still persists, reboot the node.
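
The tag selection rule above (ends in ".mon", contains the resource group name and the resource name) can be sketched as a filter over a list of tag names. The function find_monitor_tags and the sample tag strings are invented for illustration; the exact tag layout on a real system may differ, so treat the list printed by pmfadm as the authority.

```python
def find_monitor_tags(tags, resource_group, resource):
    """Return the tags that look like the fault monitor tag for a resource.

    Applies only the rule stated in the text: the tag ends in ".mon" and
    contains both the resource group name and the resource name.
    """
    return [
        tag
        for tag in tags
        if tag.endswith(".mon") and resource_group in tag and resource in tag
    ]


# Invented tag names; real tags come from the pmfadm listing.
sample_tags = [
    "nfs-rg,nfs-res,0.svc",
    "nfs-rg,nfs-res,0.mon",
    "ora-rg,ora-res,0.mon",
]
print(find_monitor_tags(sample_tags, "nfs-rg", "nfs-res"))
```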

908716 :fatal: Aborting this node because method <%s> failed on resource <%s> and Failover_mode is set to HARD

Description:

A STOP method has failed or timed out on a resource, and the Failover_mode property of that resource is set to HARD.

Solution:

No action is required. This is normal behavior of the RGM. With Failover_mode set to HARD, the rgmd reboots the node to force the resource group offline so that it can be switched onto another node. Other syslog messages that occurred just before this one might indicate the cause of the STOP method failure.

909656:Unable to open /dev/kmem:%s.

Description:

The Sun Cluster HA for NFS fault monitor attempted to open the device but failed. The specific cause of the failure is logged with the error message. The /dev/kmem interface is used to read NFS activity counters from the kernel. The Sun Cluster HA for NFS fault monitor should ignore this error and try to open the device again later. Because it is unable to read NFS activity counters from the kernel, Sun Cluster HA for NFS should attempt to contact nfsd by means of a NULL RPC. This failure might occur if there is a lack of system resources.

Solution:

Attempt to free memory by terminating any programs that are using large amounts of memory and swap. If this error persists, reboot the node.

911176:Successfully started BV on %s.

Description:

The Sun Cluster HA for BroadVision One-To-One Enterprise servers and daemons on the specified host successfully started.

Solution:

No user action required.

912352 :Could not unplumb some ip addresses.

Description:

Some of the IP addresses managed by the LogicalHostname resource were not successfully brought offline on this node.

Solution:

Use the ifconfig command to make sure that the IP addresses are indeed absent. Check for any error message before this one for a more precise reason for the failure. Use the scswitch command to move the resource group to a different node. If the problem persists, reboot the node.

912696:The action to be taken as determined by scds_fm_action is failover. However the application is not being failed over because the failover_enabled extension property is set to false. Restarting the application instead.

Description:

The failover_enabled property is set to false. The probe is restarting the application locally instead of failing it over.

Solution:

This is an informational message; no user action is needed.

913241:Could not Start BV 121 processes on $HOSTNAME.

Description:

The Sun Cluster HA for BroadVision One-To-One Enterprise servers could not start on the specified host. This failure occurs if the orbix daemon does not properly start or if there are any configuration errors.

Solution:

See if there are any internal errors. Verify the Sun Cluster HA for BroadVision One-To-One Enterprise configuration. Manually start Sun Cluster HA for BroadVision One-To-One Enterprise on the specified host. If the orbix daemon does not start, contact your authorized Sun service provider for assistance. Provide your authorized Sun service provider with a copy of the /var/adm/messages files from all nodes.

913654 :%s not specified on command line.

Description:

Required property not included in a scrgadm command.

Solution:

Reissue the scrgadm command with the required property and value.

914519 :Error when sending message to child %m

Description:

Error occurred when communicating with fault monitor child process. Child process will be stopped and restarted.

Solution:

If error persists, then disable the fault monitor and report the problem.

914655 :Restarting the resource %s.

Description:

The process monitoring facility tried to send a message to the fault monitor noting that the data service application died. It was unable to do so.

Solution:

Because some part (daemon) of the application has failed, the resource will be restarted. If the fault monitor is not yet started, wait for it to be started by the Sun Cluster framework. If the fault monitor has been disabled, enable it using scswitch.

914866 :Unable to complete some unshare commands.

Description:

HA-NFS postnet_stop method was unable to complete the unshare(1M) command for some of the paths specified in the dfstab file.

Solution:

The exact pathnames that failed to be unshared are logged in earlier messages. Run those unshare commands by hand. If the problem persists, reboot the node.

917591 :fatal: Resource type <%s> update failed with error <%d>; aborting node

Description:

The rgmd failed to read the updated resource type from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

918018 :load balancer for group '%s' released

Description:

This message indicates that the service group has been deleted.

Solution:

This is an informational message; no user action is needed.

918488:Validation failed. Invalid command line parameter %s %s.

Description:

Unable to process parameters passed to the callback method specified. This is a Sun Cluster HA for Sybase internal error.

Solution:

Report this problem to your authorized Sun service provider.

918651:Could not start the text server.

Description:

HA for Sybase failed to start the Text Server. Other syslog messages and the log file should provide additional information on possible reasons for the failure.

Solution:

Manually start the Text Server. Examine the log files and setup. See if the START method timeout value is set too low.

919740 :WARNING: error in translating address (%s) for nodeid %d

Description:

Could not get address for a node.

Solution:

Make sure the node is booted as part of a cluster.

919860 :scvxvmlg warning - %s does not link to %s, changing it

Description:

The program responsible for maintaining the VxVM device namespace has discovered inconsistencies between the VxVM device namespace on this node and the VxVM configuration information stored in the cluster device configuration system. If configuration changes were made recently, then this message should reflect one of the configuration changes. If no changes were made recently or if this message does not correctly reflect a change that has been made, the VxVM device namespace on this node may be in an inconsistent state. VxVM volumes may be inaccessible from this node.

Solution:

If this message correctly reflects a configuration change to VxVM diskgroups then no action is required. If the change this message reflects is not correct, then the information stored in the device configuration system for each VxVM diskgroup should be examined for correctness. If the information in the device configuration system is accurate, then executing '/usr/cluster/lib/dcs/scvxvmlg' on this node should restore the device namespace. If the information stored in the device configuration system is not accurate, it must be updated by executing '/usr/cluster/bin/scconf -c -D name=diskgroup_name' for each VxVM diskgroup with inconsistent information.

920103 :created %d threads to handle resource group switchback; desired number = %d

Description:

The rgmd was unable to create the desired number of threads upon starting up. This is not a fatal error, but it may cause RGM reconfigurations to take longer because it will limit the number of tasks that the rgmd can perform concurrently.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Examine other syslog messages on the same node to see if the cause of the problem can be determined.

920736 :Unknown transport type: %s.

Description:

The transport type used is not known to Solaris Clustering.

Solution:

Need a user action for this message.

922085 :INTERNAL ERROR CMM: Memory allocation error.

Description:

The CMM failed to allocate memory during initialization.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

922330:getservbyname() failed : %s.

Description:

Entries for the required services are missing.

Solution:

Verify the sources for the services specified in the /etc/nsswitch.conf file. These should have the entry for the required service.
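
One way to confirm that a service entry resolves through the configured name service sources is to perform the same kind of lookup that failed. The sketch below uses Python's standard socket.getservbyname; the wrapper name service_is_defined and the service name in the usage line are placeholders, since the message does not say which service was missing.

```python
import socket


def service_is_defined(name, protocol="tcp"):
    """Return True if the service name resolves to a port for the protocol."""
    try:
        socket.getservbyname(name, protocol)
        return True
    except OSError:
        # Raised when no configured source has an entry for the service.
        return False


# Placeholder name; substitute the service reported as missing.
print(service_is_defined("no-such-service-example"))
```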

922363 :resource %s status msg on node %s change to <%s>

Description:

This is a notification from the rgmd that a resource's fault monitor status message has changed.

Solution:

This is an informational message; no user action is needed.

923184 :CMM: Scrub failed for quorum device %s.

Description:

The scrub operation for the specified quorum device failed. As a result, this quorum device will not be added to the cluster.

Solution:

There may be other related messages on this node that may indicate the cause of this problem. Refer to the disk repair section of the administration guide for resolving this problem. After the problem has been resolved, retry adding the quorum device.

923618 :Prog <%s>: unknown command.

Description:

An internal error in ucmmd has prevented it from successfully executing a program.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

923712 :CCR: Table %s on joining node %s has the same version but different checksum as the copy in the current membership. The table on the joining node will be replaced by the one in the current membership.

Description:

The indicated table on the joining node has the same version as the one in the current membership but different contents. It will be replaced by the one in the current membership.

Solution:

This is an informational message; no user action is needed.

923959 :Warning: out of swap space; cannot store start-failed timestamp for resource group <%s>

Description:

The specified resource group failed to come online on some node, but this node is unable to record that fact due to insufficient swap space. The system is likely to halt or reboot if swap space continues to be depleted.

Solution:

Investigate the cause of swap space depletion and correct the problem, if possible.

925953 :reservation error(%s) - do_scsi3_register() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

For the user action required by this message, see the user action for message 192619.

926201 :RGM aborting

Description:

A fatal error has occurred in the rgmd. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

926550 :scha_cluster_get failed for (%s) with %d

Description:

Call to get cluster information failed. The second part of the message gives the error code.

Solution:

The calling program should handle this error. If it is not recoverable, it will exit.

926749 :Resource group nodelist is empty.

Description:

Empty value was specified for the nodelist property of the resource group.

Solution:

Either of the following situations might have occurred. Different user action is required for each. 1) If a new resource is created or updated, check the value of the nodelist property. If it is empty or not valid, provide a valid value using the scrgadm(1M) command. 2) For all other cases, treat it as an internal error. Contact your authorized Sun service provider.

927042 :Validation failed. SYBASE backup server startup file RUN_%s not found SYBASE=%s.

Description:

Backup server was specified in the extension property Backup_Server_Name. However, Backup server startup file was not found. Backup server startup file is expected to be: $SYBASE/$SYBASE_ASE/install/RUN_<Backup_Server_Name>

Solution:

Check the Backup server name specified in the Backup_Server_Name property. Verify that the SYBASE and SYBASE_ASE environment variables are set properly in the Environment_file. Verify that the RUN_<Backup_Server_Name> file exists.
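
The expected location of the startup file follows directly from the three values named above. The sketch below assembles $SYBASE/$SYBASE_ASE/install/RUN_<Backup_Server_Name>; the function name and the sample values are hypothetical, and on a real node the values come from the Environment_file and the resource's extension property.

```python
import posixpath


def backup_server_run_file(sybase, sybase_ase, backup_server_name):
    """Build the expected path of the Backup server startup file."""
    return posixpath.join(sybase, sybase_ase, "install", "RUN_" + backup_server_name)


# Hypothetical values for illustration only.
print(backup_server_run_file("/opt/sybase", "ASE-12_5", "SYB_BACKUP"))
```

Checking that the printed path exists on every node that can host the resource covers the last verification step.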

927753 :Fault monitor does not exist or is not executable

Description:

Fault monitor program specified in support file is not executable or does not exist. Recheck your installation.

Solution:

Please report this problem.

927846 :fatal: Received unexpected result <%d> from rpc.fed, aborting node

Description:

A serious error has occurred in the communication between rgmd and rpc.fed while attempting to execute a VALIDATE method. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

928235 :Validation failed. Adaptive_Server_Log_File %s not found.

Description:

File specified in the Adaptive_Server_Log_File extension property was not found. The Adaptive_Server_Log_File is used by the fault monitor for monitoring the server.

Solution:

Check that the file specified in the Adaptive_Server_Log_File extension property is accessible on all the nodes.

928382 :CCR: Failed to read table %s on node %s.

Description:

The CCR failed to read the indicated table on the indicated node. The CCR will attempt to recover this table from other nodes in the cluster.

Solution:

This is an informational message; no user action is needed.

928455 :clcomm: Couldn't write to routing socket: %d

Description:

The system prepares IP communications across the private interconnect. A write operation to the routing socket failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

929252 :Failed to start HA-NFS system fault monitor.

Description:

Process monitor facility has failed to start the HA-NFS system fault monitor.

Solution:

Check whether the system is low in memory or the process table is full and correct these problems. If the error persists, use scswitch to switch the resource group to another node.

929712 :Share path %s: file system %s is not global.

Description:

The specified share path exists on a file system which is neither mounted via /etc/vfstab nor is a global file system.

Solution:

Share paths in HA-NFS must satisfy this requirement.

930059:%s: %s.

Description:

Sun Cluster HA for SAP failed to access a file. The file in question is specified with the first %s. The reason it failed is provided with the second %s.

Solution:

Ensure that the file is accessible using the path list.

931677 :Could not reset SCSI buses on CMM configuration. Could not find clexecd in nameserver.

Description:

An error occurred while the SC3.0 software was in the process of resetting SCSI buses with shared nodes that are down.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

934249:Invalid value for property %s: %d.

Description:

An invalid value may have been specified for the property.

Solution:

Set the correct value for the property and retry the operation.

935576:$HOSTNAME is not configured for Sun Cluster HA for BroadVision One-To-One Enterprise processes.

Description:

The specified host is not configured for Sun Cluster HA for BroadVision One-To-One Enterprise processes.

Solution:

Configure Sun Cluster HA for BroadVision One-To-One Enterprise processes to run on this host, or create the resource group with the correct network resource and Sun Cluster HA for BroadVision One-To-One Enterprise resource in the resource group.

936306 :svc_setschedprio: Could not setup RT (real time) scheduling parameters: %s

Description:

The server was not able to set the scheduling mode parameters, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

937669 :CCR: Failed to update table %s.

Description:

The CCR data server failed to update the indicated table.

Solution:

There may be other related messages on this node, which may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced.

938189 :rpc.fed: program not registered or server not started

Description:

The rpc.fed daemon was not correctly initialized, or has died. This has caused a step invocation failure, and may cause the node to reboot.

Solution:

Check that the Solaris Clustering software has been installed correctly. Use ps(1) to check if rpc.fed is running. If the installation is correct, the reboot should restart all the required daemons including rpc.fed.

938618 :Couldn't create deleted subdir %s: error (%d)

Description:

While mounting this file system, cluster file system was unable to create some directories that it reserves for internal use.

Solution:

If the error is 28(ENOSPC), then mount this FS non-globally, make some space, and then mount it globally. If there is some other error, and you are unable to correct it, contact your authorized Sun service provider to determine whether a workaround or patch is available.
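
The error number in message 938618 is a standard errno value, and 28 corresponds to ENOSPC as the text states. The sketch below shows the two branches of the advice; the function name and the returned wording are illustrative, not output of any Sun Cluster tool.

```python
import errno


def describe_mount_error(error_code):
    """Map the error number from the message to the suggested action."""
    if error_code == errno.ENOSPC:
        return "no space left: mount non-globally, make space, then mount globally"
    return "other error: correct it if possible, otherwise contact your service provider"


print(describe_mount_error(errno.ENOSPC))
```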

938836 :invalid value for parameter '%sfailfast': "%s". Using default value of 'panic'

Description:

/opt/SUNWudlm/etc/udlm.conf did not have a valid entry for failfast mode. Default mode of 'panic' will be used.

Solution:

None.

939374 :CCR: Failed to access cluster repository during synchronization. ABORT node.

Description:

This node failed to access its cluster repository when it first came up in cluster mode and tried to synchronize its repository with other nodes in the cluster.

Solution:

This is usually caused by an unrecoverable failure such as disk failure. There may be other related messages on this node, which may help diagnose the problem. If the root disk on the afflicted node has failed, then it needs to be replaced. If the root disk is full on this node, boot the node into non-cluster mode and free up some space by removing unnecessary files.

940685:Configuration file %s missing for NetBackup.

Description:

The configuration file for Sun Cluster HA for NetBackup is missing or does not have correct permissions.

Solution:

Ensure that the bp.conf file or a link to it exists under /usr/openv/netbackup and that the file has the correct permissions.

941267 : Cannot determine command passed in: <%s>.

Description:

An invalid pathname, displayed within the angle brackets, was passed to a libdsdev routine such as scds_timerun or scds_pmf_start. This could be the result of incorrectly configuring the name of a START or MONITOR_START method or other property, or a programming error made by the resource type developer.

Solution:

Supply a valid pathname to a regular, executable file.

941367 :open failed: %s

Description:

Failed to open /dev/console. The "open" man page describes possible error codes.

Solution:

None. ucmmd will exit.

941416 :One or more of the SUNW.HAStoragePlus resources that this resource depends on is not online anywhere.

Description:

It is an invalid configuration to create an application resource that depends on one or more SUNW.HAStoragePlus resource(s) that are not online on any node.

Solution:

Bring the SUNW.HAStoragePlus resource(s) online before creating the application resource that depends on them, and then try the command again.

941693 :"%s" Failed to stay up.

Description:

The tag shown, being run by the rpc.pmfd server, has exited. Either the user has decided to stop monitoring this process, or the process exceeded the number of retries. An error message is output to syslog.

Solution:

This message is informational; no user action is needed.

941825 :A SCHA API error occurred while retrieving information on %s: %s.

Description:

SCHA APIs are used to interface with the Resource Group Manager component. It is likely that the RGM is experiencing problems.

Solution:

Inspect the syslog for errors. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

942307 :NAFO group %s has status %s.

Description:

The specified NAFO group is not in a functional state. A logical host resource cannot be started without a functional NAFO group.

Solution:

The LogicalHostname resource will not be brought online on this node. Check the messages (pnmd errors) that were logged just before this message for any NAFO or adapter problem. Correct the problem and rerun the scrgadm command.

944068 :clcomm: validate_policy: invalid relationship moderate %d high %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. The moderate server thread level cannot be higher than the high server thread level.

Solution:

No user action required.

944121 :Incorrect permissions set for %s

Description:

This file does not have the expected default execute permissions.

Solution:

Reset the permissions to allow execute permissions using the chmod command.
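
As a sketch of what restoring execute permission amounts to, the following adds the execute bits to a file's existing mode, much as chmod with a +x argument would. It assumes that execute permission for user, group, and other is what is expected; the message does not state the exact mode, and the helper name is hypothetical.

```python
import os
import stat


def add_execute_permission(path):
    """Add execute permission for user, group, and other to path."""
    mode = os.stat(path).st_mode
    # Keep the existing bits and turn on the three execute bits.
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```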

945717:Global service %s associated with path %s is found to be in the maintenence state.

Description:

A global (DCS) service is detected to be in the maintenance state. This global service is therefore assumed to be unavailable.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

946660:Failed to create sap state file %s:%s Might put sap resource in stop-failed state.

Description:

If Sun Cluster HA for SAP is brought up outside the control of the Sun Cluster software, Sun Cluster HA for SAP should create the state file to signal the stop method not to try to stop Sun Cluster HA for SAP via the Sun Cluster software. If Sun Cluster HA for SAP was brought up outside of the Sun Cluster software and the state file creation failed, then the Sun Cluster HA for SAP resource might end in the stop-failed state when the Sun Cluster software tries to stop Sun Cluster HA for SAP. This is an internal error.

Solution:

Save the /var/adm/messages files from all nodes. Contact your authorized Sun service provider.

946873:The Host $i is not yet up.

Description:

The host specified is not running.

Solution:

Bring the resource group containing the specified host online, if it is not running. If the resource group is online, no user action is required because the Sun Cluster HA for BroadVision One-To-One Enterprise probe should take appropriate action.

947401 :reservation error(%s) - Unable to open device %s, error %d.

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

This may be indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.
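
The per-transition advice above can be summarized as a lookup from the transition named in the message to the follow-up action. The table restates only what the text says; the names RECOVERY_BY_TRANSITION and recovery_hint are hypothetical, and no scswitch options are shown because the text does not give them.

```python
RUN_RESERVE = "execute '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes"

# Transition named in the message -> follow-up once the hardware problem is fixed.
RECOVERY_BY_TRANSITION = {
    "node_join": RUN_RESERVE,
    "release_shared_scsi2": RUN_RESERVE,
    "make_primary": "start the device group, or move it to this node, with scswitch",
    "primary_to_secondary": "retry the shutdown or switchover of the device group",
}


def recovery_hint(transition):
    """Return the follow-up action for a transition, or a fallback note."""
    return RECOVERY_BY_TRANSITION.get(transition, "no transition-specific action listed")


print(recovery_hint("make_primary"))
```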

948847 :ucm_callback for start_trans generated exception %d

Description:

The ucmm callback for the start transition failed. The step may have timed out.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

949148 :"%s" requeued

Description:

The tag shown has exited and was restarted by the rpc.pmfd server. An error message is output to syslog.

Solution:

This message is informational; no user action is needed.

949565 :reservation error(%s) - do_scsi2_tkown() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

The action which failed is a scsi-2 ioctl. These can fail if there are scsi-3 keys on the disk. To remove invalid scsi-3 keys from a device, use 'scdidadm -R' to repair the disk (see scdidadm man page for details). If there were no scsi-3 keys present on the device, then this error is indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.

949937 :Out of memory.

Description:

The data service has failed to allocate memory, most likely because the system has run out of swap space.

Solution:

The problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.

950747 :resource %s monitor disabled.

Description:

This is a notification from the rgmd that the operator has disabled monitoring on a resource. This may be used by system monitoring tools.

Solution:

This is an informational message; no user action is needed.

950760 :reservation fatal error(%s) - get_resv_lock() error

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

951501 :CCR: Could not initialize CCR data server.

Description:

The CCR data server could not initialize on this node. This usually happens when the CCR is unable to read its metadata entries on this node. There is a CCR data server per cluster node.

Solution:

There may be other related messages on this node, which may help diagnose this problem. If the root disk failed, it needs to be replaced. If there was cluster repository corruption, then the cluster repository needs to be restored from backup or other nodes in the cluster. Boot the offending node in -x mode to restore the repository. The cluster repository is located at /etc/cluster/ccr/.

951520:Validation failed. SYBASE ASE runserver file RUN_%s not found SYBASE=%s.

Description:

The Sybase Adaptive Server is started by using the runserver file named RUN_<Server Name>, which is located under $SYBASE/$SYBASE_ASE/install. This file is missing.

Solution:

Verify that the Sybase installation includes the runserver file and that permissions are correctly set on the file. The file should reside in the $SYBASE/$SYBASE_ASE/install directory.

951634 :INTERNAL ERROR CMM: clconf_get_quorum_table() returned error %d.

Description:

The node encountered an internal error during initialization of the quorum subsystem object.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

951733 :Incorrect usage: %s.

Description:

The usage of the program was incorrect for the reason given.

Solution:

Use the correct syntax for the program.

952237 :Method <%s>: unknown command.

Description:

An internal error has occurred in the interface between the rgmd and fed daemons. This in turn will cause a method invocation to fail. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

952465 :HA: exception adding secondary

Description:

A failure occurred while attempting to add a secondary provider for an HA service.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

953642 :Server is not running. Calling shutdown abort to clear shared memory (if any)

Description:

This is an informational message. The Oracle server is not running. However, if Oracle processes were aborted without clearing shared memory, the leftover memory can cause problems when the Oracle server is started. Any leftover shared memory is being cleared.

Solution:

None.

954497 :clcomm: Unable to find %s in name server

Description:

The specified entity is unknown to the name server.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

955759 :Global service %s associated with path %s is found to be unavailable: %s.

Description:

An error occurred while checking the availability of a global (DCS) service of a file. The global service is therefore assumed to be unavailable.

Solution:

Inspect the syslog for other errors. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

956501 :Issuing a failover request.

Description:

This message indicates that the function is about to make a failover request to the RGM. If the request fails, refer to the syslog messages that appear after this message.

Solution:

This is an informational message; no user action is required.

957086 :Prog <%s> failed to execute step <%s> - error=<%d>

Description:

ucmmd failed to execute a step.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.

958425 :clcomm: Cannot fork1() after cluster initialization

Description:

A user level process attempted to fork1 after cluster initialization. This is not allowed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

958832 :INTERNAL ERROR: monitoring is enabled, but MONITOR_STOP method is not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.

958888 :clcomm: Failed to allocate simple xdoor client %d

Description:

The system could not allocate a simple xdoor client. This can happen when the xdoor number is already in use. This message is only possible on debug systems.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

959384 :Possible syntax error in hosts entry in %s.

Description:

The validation callback method failed to validate the hostname list. There may be a syntax error in the nsswitch.conf file.

Solution:

Check for the following syntax rules in the nsswitch.conf file. 1) Check if the lookup order for "hosts" has "files". 2) "cluster" is the only entry that can come before "files". 3) Everything in between '[' and ']' is ignored. 4) It is illegal to have any leading whitespace character at the beginning of the line; these lines are skipped. Correct the syntax in the nsswitch.conf file and try again.
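
The four rules above can be sketched as a small check. This is one interpretation of the rules as stated, not the validation callback itself. The function name is hypothetical, and treating "#" as the start of a comment is an assumption.

```python
import re


def check_hosts_entry(text):
    """Return a list of problems found in the hosts line of nsswitch.conf text."""
    for line in text.splitlines():
        # Rule 4: lines that begin with whitespace are skipped.
        if not line or line[0].isspace():
            continue
        # Assumption: '#' starts a comment.
        line = line.split("#", 1)[0]
        if not line.startswith("hosts:"):
            continue
        # Rule 3: everything between '[' and ']' is ignored.
        spec = re.sub(r"\[[^\]]*\]", " ", line[len("hosts:"):])
        sources = spec.split()
        # Rule 1: the lookup order must include "files".
        if "files" not in sources:
            return ['"files" is missing from the hosts lookup order']
        # Rule 2: only "cluster" may come before "files".
        extra = [s for s in sources[:sources.index("files")] if s != "cluster"]
        if extra:
            return ['only "cluster" may precede "files"; found: ' + ", ".join(extra)]
        return []
    return ["no usable hosts entry found"]
```

An empty list means that the hosts entry satisfies the rules as interpreted here.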

959433 :Failed to initialize DSDL.

Description:

An error occurred when initializing the RGM's DSDL library.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

959610 :Property %s should have only one value.

Description:

A multi-valued (comma-separated) list was provided to the scrgadm command for the property, while the implementation supports only one value for this property.

Solution:

Specify a single value for the property on the scrgadm command.

960308 :clcomm: Pathend %p: remove_path called twice

Description:

The system maintains state information about a path. The remove_path operation is not allowed in this state.

Solution:

No user action is required.

960344 :ERROR: process_resource: resource <%s> is pending_init but no INIT method is registered

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

960862 :(%s) sigaction failed: %s (UNIX errno %d)

Description:

The udlm has failed to initialize signal handlers by a call to sigaction(2). The error message indicates the reason for the failure.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

960932 :Switchover (%s) error: failed to fsck disk

Description:

The file system specified in the message could not be hosted on the node the message came from because an fsck on the file system revealed errors.

Solution:

Unmount the cluster file system (if mounted), fsck the device, and then mount the cluster file system again.

961551 :Signal %d terminated the scalable service configuration process.

Description:

An unexpected signal caused the termination of the program that configures the networking components for a scalable resource. This premature termination will cause the scalable service configuration to be aborted for this resource.

Solution:

Save a copy of the /var/adm/messages files on all nodes. If a core file was generated, submit the core to your service provider. Contact your authorized Sun service provider for assistance in diagnosing the problem.

962746 :Usage: %s [-c|-u] -R -T -G [-r sys_def_prop=values ...] [-x ext_prop=values ...].

Description:

Incorrect arguments are passed to the callback methods.

Solution:

This is an internal error. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

963465 :fatal: rpc_control() failed to set automatic MT mode; aborting node

Description:

The rgmd failed in a call to rpc_control(3N). This error should never occur. If it did, it would cause the failure of subsequent invocations of scha_cmds(1HA) and scha_calls(3HA). This would most likely lead to resource method failures and prevent RGM reconfigurations from occurring. The rgmd will produce a core file and will force the node to halt or reboot.

Solution:

Examine other syslog messages occurring at about the same time to see if the source of the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem. Reboot the node to restart the clustering daemons.

963755 :lkcm_cfg: caller is not registered

Description:

udlm is not registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.
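
As a minimal sketch, the files named above could be gathered into one archive on each node before contacting support. The path patterns are taken from the Solution text. The function name, the archive name, and the root parameter are hypothetical and exist only so that the sketch can be tried against a scratch directory.

```python
import glob
import os
import tarfile

# Path patterns copied from the Solution text, relative to the file system root.
PATTERNS = [
    "var/adm/messages",
    "var/cluster/ucmm/ucmm_reconf.log",
    "var/cluster/ucmm/dlm*/*logs/*",
]


def collect_diagnostics(archive_path, root="/"):
    """Archive the files that match PATTERNS under root; return the archived names."""
    saved = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for pattern in PATTERNS:
            for path in sorted(glob.glob(os.path.join(root, pattern))):
                # Skip directories; only regular files are archived.
                if os.path.isfile(path):
                    name = os.path.relpath(path, root)
                    tar.add(path, arcname=name)
                    saved.append(name)
    return saved
```

The sketch collects files from the local node only, so it would have to be run on every node.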

964072 :Unable to resolve %s.

Description:

The data service has failed to resolve the host information.

Solution:

If the logical host and shared address entries are specified in the /etc/inet/hosts file, check that these entries are correct. If this is not the reason, then check the health of the name server. For more error information, check the syslog messages.

964083 :t_open (open_cmd_port) failed

Description:

Call to t_open() failed. The "t_open" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

964399 :udlm seq no (%d) does not match library's (%d).

Description:

Mismatch in sequence numbers between udlm and the library code is causing an abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

964521 :Failed to retrieve the resource handle: %s.

Description:

An API operation on the resource has failed.

Solution:

For the resource name, check the syslog tag. For more details, check the syslog messages from other components. If the error persists, reboot the node.

965873 :CMM: Node %s (nodeid = %d) with votecount = %d added.

Description:

The specified node with the specified votecount has been added to the cluster.

Solution:

This is an informational message; no user action is needed.

966245 :%d entries found in property %s. For a secure Netscape Directory Server instance %s should have one or two entries.

Description:

Since a secure Netscape Directory Server instance can listen on only one or two ports, the list property should have either one or two entries. A different number of entries was found.

Solution:

Change the number of entries to be either one or two.

966416 :This list element in System property %s has an invalid protocol: %s.

Description:

The system property that was named does not have a valid protocol.

Solution:

Change the value of the property to use a valid protocol.

966842 :in libsecurity unknown security flag %d

Description:

This is an internal error which shouldn't happen. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

967050 :Validation failed. Listener binaries not found ORACLE_HOME=%s

Description:

The Oracle listener binaries were not found under ORACLE_HOME. The ORACLE_HOME specified for the resource is indicated in the message. HA-Oracle cannot manage the Oracle listener if ORACLE_HOME is incorrect.

Solution:

Specify the correct ORACLE_HOME when creating the resource. If the resource has already been created, update the resource property 'ORACLE_HOME'.

967080:This resource depends on a HAStoragePlus resource that is not online on this node. Ignoring validation errors.

Description:

The resource depends on a HAStoragePlus resource. Some of the files required for validation checks are not accessible from this node, because the HAStoragePlus resource is not online on this node. Validations will be performed on the node that has the HAStoragePlus resource online. Validation errors are being ignored on this node by this callback method.

Solution:

Check the validation errors logged in the syslog messages. Please verify that these errors are not configuration errors.

967970 :Modification of resource <%s> failed because none of the nodes on which VALIDATE would have run are currently up

Description:

In order to change the properties of a resource whose type has a registered VALIDATE method, the rgmd must be able to run VALIDATE on at least one node. However, all of the candidate nodes are down. "Candidate nodes" are either members of the resource group's Nodelist or members of the resource type's Installed_nodes list, depending on the setting of the resource's Init_nodes property.

Solution:

Boot one of the resource group's potential masters and retry the resource change operation.

968426:arguments to bv_utils are $*.

Description:

This is a debug message. The make scmsgs command does not seem to recognize the debug flag in scds_syslog command line utility.

Solution:

No user action required.

968557 :Could not unplumb any ip addresses.

Description:

Failed to unplumb any IP addresses. The resource cannot be brought offline. The node will be rebooted by Sun Cluster.

Solution:

Check the syslog messages from other components for a possible root cause. Save a copy of /var/adm/messages and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

968853 :scha_resource_get error (%d) when reading system property %s

Description:

An error occurred in the API call scha_resource_get.

Solution:

Check the syslog messages for errors logged from other system modules. Stop and start the fault monitor. If the error persists, disable the fault monitor and report the problem.

969008 :t_alloc (open_cmd_port-T_ADDR) %d

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

969827 :Failover attempt has failed.

Description:

The failover attempt of the resource was rejected or encountered an error.

Solution:

For a more detailed error message, check the syslog messages. Check whether the Pingpong_interval has an appropriate value. If not, adjust it using scrgadm(1M). Otherwise, use scswitch to switch the resource group to a healthy node.

970232:Validation failed. SYBASE ASE stop server file %s not found.

Description:

Sun Cluster HA for Sybase executes the stopserver script to stop all servers. This file is specified by the Stop_File extension property. This file might be missing.

Solution:

Verify that the Stop_File extension property correctly specifies the stopserver file.

970912 :execve: %s

Description:

The rpc.pmfd server was not able to exec a new process, possibly due to bad arguments. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Verify that the file path to be executed exists. If all looks correct, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

971233 :Property %s is not set.

Description:

The property is required but has not been set by the user.

Solution:

Reissue the scrgadm command with the required property and value.

971412 :Error in getting global service name for path <%s>

Description:

The path cannot be mapped to a valid service name.

Solution:

Check the path passed into extension property "ServicePaths" of SUNW.HAStorage type resource.

972580 :CCR: Highest epoch is < 0, highest_epoch = %d.

Description:

The epoch indicates the number of times a cluster has come up. It should not be less than 0. A negative value can result from corruption in the cluster repository.

Solution:

Boot the cluster in -x mode to restore the cluster repository on all the members of the cluster from backup. The cluster repository is located at /etc/cluster/ccr/.

972610 :fork: %s

Description:

The rgmd, rpc.pmfd or rpc.fed daemon was not able to fork a process, possibly due to low swap space. The message contains the system error. This can happen while the daemon is starting up (during the node boot process), or when executing a client call. If it happens when starting up, the daemon does not come up. If it happens during a client call, the server does not perform the action requested by the client.

Solution:

Investigate whether the machine is running out of swap space. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

972716:Failed to stop the application with SIGKILL. Returning with failure from stop method.

Description:

The stop method failed to stop the application with SIGKILL.

Solution:

Use pmfadm(1M) with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the application in this resource. This can be easily identified as the tag ends in the string ".svc" and contains the resource group name and the resource name. Then use pmfadm(1M) with the option to stop the application. If the error persists, then reboot the node.
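
The tag-matching step can be sketched as a simple filter. It assumes that the tags reported by pmfadm(1M) have already been read into a list of strings. The function name and the sample tag names in the usage note are hypothetical.

```python
def candidate_tags(tags, resource_group, resource):
    """Return the tags that end in '.svc' and contain both names."""
    return [
        tag for tag in tags
        if tag.endswith(".svc") and resource_group in tag and resource in tag
    ]
```

For example, given the made-up tags "web-rg,web-rs,0.svc", "web-rg,web-rs,0.mon", and "db-rg,db-rs,0.svc", a call with resource group "web-rg" and resource "web-rs" returns only the first tag.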

972908:Unable to get the name of the local cluster node: %s.

Description:

An internal error occurred while attempting to obtain the local cluster nodename.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

973308:Starting Sybase adaptive server.

Description:

Sun Cluster HA for Sybase is going to start the Sybase Adaptive Server.

Solution:

No user action required.

973615 :Node %s: weight %d

Description:

The load balancer set the specified weight for the specified node.

Solution:

This is an informational message; no user action is needed.

973933 :resource %s added.

Description:

This is a notification from the rgmd that the operator has created a new resource. This may be used by system monitoring tools.

Solution:

This is an informational message; no user action is needed.

974106 :lkcm_parm: caller is not registered

Description:

udlm is not registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

974129 :Cannot stat %s: %s.

Description:

The stat(2) system call failed on the specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start. The reason for the failure is stated in the message. The error could be the result of 1) incorrectly configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.
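
A minimal sketch of such a check follows. It is not the libdsdev implementation, and the function name is hypothetical. It reports the first reason why a pathname would not qualify as a regular, executable file.

```python
import os
import stat


def check_method_path(path):
    """Return None if path is a regular, executable file, else a short reason."""
    try:
        info = os.stat(path)
    except OSError as err:
        # Covers a missing file, a bad path component, or a permission problem.
        return "cannot stat: " + str(err)
    if not stat.S_ISREG(info.st_mode):
        return "not a regular file"
    if not os.access(path, os.X_OK):
        return "not executable"
    return None
```

Note that os.access tests the permissions of the calling user, which may differ from the user that runs the method.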

974664 :HA: no valid secondary provider in rmm - aborting

Description:

This node joined an existing cluster. Then all of the other nodes in the cluster died before the HA framework components on this node could be properly initialized.

Solution:

This node must be rebooted.

976495 :fork failed: %s

Description:

The fork() call failed. The "fork" man page describes possible error codes.

Solution:

Some system resource has been exceeded. Install more memory, increase swap space or reduce peak memory consumption.

977371 :Backup server terminated.

Description:

Graceful shutdown did not succeed. Backup server processes were killed in the STOP method. It is likely that the Adaptive Server terminated prior to shutdown of the backup server.

Solution:

Check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.

978125 :in libsecurity setnetconfig failed when initializing the server: %s - %s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish an RPC connection for the network specified. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

978534 :%s: lookup failed.

Description:

Could not get the hostname for a node. This could also be because the node is not booted as part of a cluster.

Solution:

Make sure the node is booted as part of a cluster.

978829 :t_bind, did not bind to desired addr

Description:

Call to t_bind() failed. The "t_bind" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

979343 :Error: duplicate prog <%s> launched step <%s>

Description:

Due to an internal error, ucmmd has attempted to launch the same step by duplicate programs. ucmmd will reject the second program and treat it as a step failure.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

979578 :Error when analysing the device special file associated with file system mount point %s.

Description:

The device special file associated with a file system mount point may not be a valid DCS device. Only global devices can be used as device special files for specifying local file system mount points.

Solution:

Inspect the syslog for other errors. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

979803 :CMM: Node being shut down.

Description:

This node is being shut down.

Solution:

This is an informational message; no user action is needed.

980307 :reservation fatal error(%s) - Illegal command

Description:

The device fencing program has suffered an internal error.

Solution:

980425:Aborting startup: could not determine whether failover of NFS resource groups is in progress.

Description:

Startup of an NFS resource was aborted because it was not possible to determine if a failover of any NFS resource groups is in progress.

Solution:

Contact your authorized Sun service provider for assistance. Provide your authorized Sun service provider with a copy of the /var/adm/messages files from all nodes.

980477 :LogicalHostname online.

Description:

The status of the logicalhost resource is online.

Solution:

This is an informational message. No user action is required.

980681 :clconf: CSR removal failed

Description:

While executing a task in clconf and modifying the state of the proxy, an attempt to remove the CSR failed.

Solution:

This is an informational message. No user action is required.

980942 :CMM: Cluster doesn't have operational quorum yet; waiting for quorum.

Description:

Not enough nodes are operational to obtain a majority quorum; the cluster is waiting for more nodes before starting.

Solution:

If nodes are booting, wait for them to finish booting and join the cluster. Boot nodes that are down.
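
The majority condition can be illustrated with a short sketch. It assumes that "majority" means more than half of all configured votes. The function name is hypothetical, and the sketch does not model how the CMM actually counts votes.

```python
def has_majority(votes_present, votes_configured):
    """Return True if the votes present are more than half of the configured votes."""
    # Integer comparison avoids any rounding: present * 2 > configured.
    return votes_present * 2 > votes_configured
```

Under this assumption, 2 of 3 votes is a majority, while 1 of 2 and 2 of 4 are not.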

981739:CCR: Updating invalid table %s.

Description:

This joining node carries a valid copy of the indicated table with the override flag set, while the current cluster membership does not have a valid copy of this table. This node will propagate its copy of the indicated table to the other nodes in the cluster.

Solution:

This is an informational message; no user action is needed.

981931 :INTERNAL ERROR: postpone_start_r: meth type <%d>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

983305 :clconf: Failed to open table infrastructure in unregister_infr_callback.

Description:

The infrastructure table could not be opened while unregistering the clconf callback with the CCR. The infrastructure table was not found.

Solution:

Check the table infrastructure.

984704 :reset_rg_state: unable to change state of resource group <%s> on node <%d>; assuming that node died

Description:

The rgmd was unable to reset the state of the specified resource group to offline on the specified node, presumably because the node died.

Solution:

Examine syslog output on the specified node to determine the cause of node death. The syslog output might indicate further remedial actions.

985111 :lkcm_reg: illegal %s value

Description:

Cluster information that is being used during udlm registration with ucmm is incorrect.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

986190 :Entry at position %d in property %s with value %s is not a valid node identifier or node name.

Description:

The value given for the named property has an invalid node specified for it. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Specify a valid node for the property.
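
Because the position index is zero-based, the second element of the list is reported as position 1. A minimal sketch of such a check follows. The function name and the node names in the usage note are hypothetical.

```python
def first_invalid_entry(entries, valid_nodes):
    """Return (position, value) of the first entry not in valid_nodes, or None."""
    # enumerate() starts at 0, matching the zero-based position in the message.
    for position, value in enumerate(entries):
        if value not in valid_nodes:
            return position, value
    return None
```

For the made-up list "node1", "nodeX", "node2" checked against the valid names "node1" and "node2", the sketch reports position 1 and the value "nodeX".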

986197 :reservation fatal error(%s) - malloc() error, errno %d

Description:

The device fencing program has been unable to allocate required memory.

Solution:

Memory usage should be monitored on this node and steps taken to provide more available memory if problems persist. Once memory has been made available, the following steps may need to be taken: If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, access to shared devices can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. The device group can be switched back to this node if desired by using the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

986466 :clexecd: stat of '%s' failed

Description:

The clexecd program failed to stat the directory indicated in the error message.

Solution:

Make sure the directory exists. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.

987455 :in libsecurity weak Unix authorization failed

Description:

A server (rgmd) refused an rpc connection from a client because it failed the Unix authentication. This happens if a caller program using scha public api, either in its C form or its CLI form, is not running as root. An error message is output to syslog.

Solution:

Check that the calling program using the scha public api is running as root. If the program is running as root, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

987601 :scvxvmlg error - opendir(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.

988416 :t_sndudata (2) in send_reply: %s

Description:

Call to t_sndudata() failed. The "t_sndudata" man page describes possible error codes.

Solution:

None.

988719 :Warning: Unexpected result returned while checking for the existence of scalable service group %s: %d.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

988937 :Extension property <failover_enabled> has a value of <%d>

Description:

Resource property failover_enabled is set to a value or has a default value. The value '1' means TRUE and '0' means FALSE.

Solution:

This is an informational message; no user action is needed.

989693 :thr_create failed

Description:

Could not create a new thread. The "thr_create" man page describes possible error codes.

Solution:

Some system resource has been exceeded. Install more memory, increase swap space or reduce peak memory consumption.

989846 :ERROR: unpack_rg_seq(): rgname_to_rg failed <%s>

Description:

Due to an internal error, the rgmd was unable to find the specified resource group data in memory.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

990215 :HA: repl_mgr: exception while invoking RMA reconf object

Description:

An unrecoverable failure occurred in the HA framework.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

990418 :received signal %d

Description:

The daemon indicated in the message tag (rgmd or ucmmd) has received a signal, possibly caused by an operator-initiated kill(1) command. The signal is ignored.

Solution:

The operator must use scswitch(1M) and shutdown(1M) to take down a node, rather than directly killing the daemon.

991108 :uaddr2taddr (open_cmd_port) failed

Description:

Call to uaddr2taddr() failed. The "uaddr2taddr" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

991130 :pthread_create: %s

Description:

The rpc.pmfd server was not able to allocate a new thread, probably due to low memory, and the system error is shown. This can happen when a new tag is started, or when monitoring for a process is set up. If the error occurs when a new tag is started, the tag is not started and pmfadm returns an error. If the error occurs when monitoring for a process is set up, the process is not monitored. An error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

991219 :Failed to stop HA-NetBackup master: command %s failed.

Description:

Failed to stop the Sun Cluster HA for NetBackup processes.

Solution:

Contact your authorized Sun service provider for assistance. Provide your authorized Sun service provider with a copy of the /var/adm/messages files from all nodes.

991800 :in libsecurity transport %s is not a loopback transport

Description:

A server (rpc.pmfd, rpc.fed or rgmd) refused an rpc connection from a client because the named transport is not a loopback. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

991864 :putenv: %s

Description:

The rpc.pmfd server was not able to change environment variables. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

992912 :clexecd: thr_sigsetmask returned %d. Exiting.

Description:

The clexecd program has encountered a failed thr_sigsetmask(3THR) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

992998 :clconf: CSR registration failed

Description:

While executing a task in clconf and modifying the state of the proxy, a failure was received from registering CSR.

Solution:

This is an informational message. No user action is required.

995026 :lkcm_cfg: invalid handle was passed %s %d

Description:

The handle for communication with udlmctl, passed during a call to return the current DLM configuration, is invalid.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

995339 :Restarting using scha_control RESTART

Description:

The fault monitor has detected problems in the RDBMS server. An attempt will be made to restart the RDBMS server on the same node.

Solution:

Check the cause of RDBMS failure.

996075 :fatal: Unable to resolve %s from nameserver

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

996369 :PNM: successfully started

Description:

The PNM daemon has successfully started. Adapter monitoring and failover are enabled.

Solution:

This message is informational; no user action is needed.

996897 :Method <%s> on resource <%s>: stat of program file failed.

Description:

The rgmd was unable to access the indicated resource method file. This may be caused by incorrect installation of the resource type.

Solution:

Consult the resource type documentation and, if necessary, install or reinstall the resource type.
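
Before reinstalling, it may help to confirm what is wrong with the method file. The following Python sketch is an assumption-laden illustration, not the check the rgmd performs: the helper name check_method_file is hypothetical, and the regular-file and execute-bit checks go beyond the stat failure that the message reports.

```python
import os
import stat


def check_method_file(path):
    """Hypothetical helper: describe why a resource method file may be unusable."""
    try:
        info = os.stat(path)
    except OSError as err:
        # This is the case the message reports: stat of the file failed.
        return "stat failed: " + err.strerror
    # The remaining checks are extra assumptions, not taken from the message.
    if not stat.S_ISREG(info.st_mode):
        return "not a regular file"
    if not info.st_mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        return "not executable"
    return "ok"
```

Run the helper against the method path named in the resource type registration. A "stat failed" result points to a missing or unreachable file, which is consistent with an incorrect installation.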

996902 :Stopped the HA-NFS system fault monitor.

Description:

The HA-NFS system fault monitor was stopped successfully.

Solution:

No action required.

997689 :IP address %s is an IP address in resource %s and in resource %s.

Description:

The same IP address is being used in two resources. This is not a correct configuration.

Solution:

Delete one of the resources that is using the duplicated IP address.
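
To find which resources share an address before deleting one, a check along the following lines could be used. This is a minimal Python sketch: the function name find_duplicate_ips and the input mapping of resource names to IP address lists are hypothetical, and gathering that mapping from the cluster configuration is left out. Addresses are compared as plain strings.

```python
from collections import defaultdict


def find_duplicate_ips(resources):
    """Hypothetical helper.

    resources: mapping of resource name -> list of IP address strings.
    Returns {ip: sorted resource names} for every IP address that
    appears in more than one resource.
    """
    owners = defaultdict(set)
    for name, addresses in resources.items():
        for ip in addresses:
            owners[ip].add(name)
    return {ip: sorted(names) for ip, names in owners.items() if len(names) > 1}
```

An empty result means no address is shared between the resources that were passed in.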

998022 :Failed to restart the service: %s.

Description:

Restart attempt of the data service has failed.

Solution:

Check the syslog messages that occurred just before this message to determine whether there was an internal error. In case of an internal error, contact your Sun service provider. Otherwise, one of the following situations may have occurred. 1) The Start_timeout and Stop_timeout values are not appropriate. Check these values and adjust them as needed. 2) The system lacks resources. Check whether the system is low on memory or the process table is full, and take appropriate action.

998473 :Cannot remove file %s/%s.mrg.

Description:

The file $MONSERVER_SHM_DIR/$MONITOR_SERVER_NAME.mrg is used by the Monitor Server to store information about Solaris IPC objects. Graceful shutdowns result in automatic deletion of this file. Sun Cluster HA for Sybase attempts to remove this file prior to the Monitor Server startup/shutdown process and logs this error message if it encounters an error.

Solution:

Remove the file using the root account, if necessary.
