Sun Cluster 3.1 Error Messages Guide

Chapter 2 Error Messages

This chapter contains a numeric listing of Sun Cluster 3.1 error messages and descriptions. The following sections are contained in this chapter.

Message IDs 100000–199999


100088 fatal: Got error <%d> trying to read CCR when making resource group <%s> managed; aborting node

Description:

Rgmd failed to read updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


100293 dl_bind: kstr_msg failed %d error

Description:

Could not bind to the private interconnect.

Solution:

Rebooting the node might fix the problem.


100396 clexecd: unable to arm failfast.

Description:

The clexecd program could not enable one of the mechanisms that causes the node to be shut down, to prevent data corruption, when the clexecd program dies.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


100555 libsecurity: getnetconfigent error: %s

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection, because it could not get the network information. The pmfadm or scha command exits with error. The rpc error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


102218 couldn't initialize ORB, possibly because machine is booted in non-cluster mode

Description:

Could not initialize the ORB.

Solution:

Please make sure the nodes are booted in cluster mode.


102340 Prog <%s> step <%s>: authorization error.

Description:

An attempted program execution failed, apparently due to a security violation; this error should not occur. This failure is considered a program failure.

Solution:

Correct the problem identified in the error message. If necessary, examine other syslog messages occurring at about the same time to see if the problem can be diagnosed. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


102779 %s has been removed from %s. Make sure that all HA IP addresses hosted on %s are moved.

Description:

Removing an adapter directly from an IPMP group is not supported. The correct way to perform dynamic reconfiguration (DR) of an adapter is to use if_mpadm(1M). This message notifies the user of the potential error.

Solution:

This message is informational; no user action is needed if the DR was done properly (using if_mpadm).


103566 %s is not an absolute path.

Description:

The extension property listed is not an absolute path.

Solution:

Make sure the path starts with '/'.
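
The check described above can be sketched in a few lines of Python. This is only an illustration of the rule "the path starts with '/'"; the function name and the sample values are hypothetical and are not part of Sun Cluster.

```python
import posixpath


def is_absolute_path(value: str) -> bool:
    """Return True when value is an absolute POSIX path, that is, it starts with '/'."""
    return posixpath.isabs(value)


if __name__ == "__main__":
    # Hypothetical property values, used only for illustration.
    for candidate in ("/global/oracle/env.sh", "oracle/env.sh"):
        print(candidate, is_absolute_path(candidate))
```

Running such a check before setting the extension property catches relative paths early.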


104035 Failed to start sap processes with command %s.

Description:

Failed to start up SAP with the specified command.

Solution:

SAP Central Instance failed to start on this cluster node. It would be started on some other cluster node provided there is another cluster node available. If the Central Instance failed to start on any other node, disable the SAP Central Instance resource, then try to run the same command manually, and fix any problem found. Save the /var/adm/messages from all nodes. Contact your authorized Sun service provider.


104914 CCR: Failed to set epoch on node %s errno = %d.

Description:

The CCR was unable to set the epoch number on the indicated node. The epoch was set by CCR to record the number of times a cluster has come up. This information is part of the CCR metadata.

Solution:

There may be other related messages on the indicated node that help diagnose the problem. For example, if the root file system is full on the node, free up some space by removing unnecessary files. If the root disk on the affected node has failed, replace it.


105040 'dbmcli' failed in command %s.

Description:

SAP utility 'dbmcli -d <LC_NAME> -n <logical hostname> db_state' failed to complete as user <lc-name>adm.

Solution:

Check the SAP liveCache installation and SAP liveCache log files for reasons that might cause this. Make sure the cluster nodes are booted in 64-bit mode, because liveCache runs only in 64-bit mode. If this error caused the SAP liveCache resource to be in any error state, use the SAP liveCache utility to stop and dbclean the SAP liveCache database before trying to start it up again.


105222 Waiting for %s to startup

Description:

Waiting for the application to startup.

Solution:

This message is informational; no user action is needed.


105337 WARNING: thr_getspecific %d

Description:

The rgmd has encountered a failed call to thr_getspecific(3T). The error message indicates the reason for the failure. This error is non-fatal.

Solution:

If the error message is not self-explanatory, contact your authorized Sun service provider for assistance in diagnosing the problem.


105450 Validation failed. ASE directory %s does not exist.

Description:

The Adaptive Server Environment directory does not exist. The SYBASE_ASE environment variable may be incorrectly set or the installation may be incorrect.

Solution:

Check the SYBASE_ASE environment variable value and verify the Sybase installation.


106181 WARNING: lkcm_act: %d returned from udlm_recv_message (the error was successfully masked from upper layers).

Description:

Unexpected error during a poll for dlm messages.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


108357 lookup: unknown binding type <%d>

Description:

During a name server lookup an unknown binding type was encountered.

Solution:

No action required. This is an informational message.


108990 CMM: Cluster members: %s.

Description:

This message identifies the nodes currently in the cluster.

Solution:

This is an informational message, no user action is needed.


109102 %s should be larger than %s.

Description:

The value of Thorough_Probe_Interval specified in scrgadm command or in CCR table was smaller than Cheap_Probe_Interval.

Solution:

Reissue the scrgadm command with appropriate values as indicated.


109105 (%s) setitimer failed: %d: %s (UNIX errno %d)

Description:

Call to setitimer() failed. The "setitimer" man page describes possible error codes. udlmctl will exit.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


110012 lkcm_dreg failed to communicate to CMM ... will probably failfast: %s

Description:

Could not deregister udlm from ucmm. This node will probably failfast.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


110097 Major number for driver (%s) does not match the one on other nodes.

Description:

The driver identified in this message does not have the same major number across cluster nodes, and devices owned by the driver are being used in global device services.

Solution:

Look in the /etc/name_to_major file on each cluster node to see if the major number for the driver matches across the cluster. If a driver is missing from the /etc/name_to_major file on some of the nodes, then most likely, the package the driver ships in was not installed successfully on all nodes. If this is the case, install that package on the nodes that don't have it. If the driver exists on all nodes but has different major numbers, see the documentation that shipped with this product for ways to correct this problem.
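
As a rough aid for the comparison described above, the following Python sketch parses the contents of /etc/name_to_major collected from each node and reports the major number that each node assigns to one driver. It assumes the usual two-column layout (driver name, then major number) and that the file contents have already been gathered from every node by some other means. The function names, node names, and sample data are hypothetical.

```python
from typing import Dict, Optional


def parse_name_to_major(text: str) -> Dict[str, int]:
    """Map driver name to major number from name_to_major file contents."""
    majors: Dict[str, int] = {}
    for line in text.splitlines():
        # Drop anything after a '#' and ignore blank lines.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[1].isdigit():
            majors[fields[0]] = int(fields[1])
    return majors


def driver_major_by_node(files_by_node: Dict[str, str], driver: str) -> Dict[str, Optional[int]]:
    """Return the major number of driver on each node, or None if it is missing."""
    return {
        node: parse_name_to_major(text).get(driver)
        for node, text in files_by_node.items()
    }


def majors_consistent(per_node: Dict[str, Optional[int]]) -> bool:
    """True only if every node has the driver and all major numbers agree."""
    values = set(per_node.values())
    return len(values) == 1 and None not in values


if __name__ == "__main__":
    # Hypothetical file contents from three nodes.
    files = {
        "node1": "cn 0\ndid 300\n",
        "node2": "cn 0\ndid 301\n",
        "node3": "cn 0\n",
    }
    result = driver_major_by_node(files, "did")
    print(result, majors_consistent(result))
```

A value of None for a node suggests the missing-package case; differing numbers suggest the mismatched-major case.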


111527 Method <%s> on resource <%s>: unknown command.

Description:

An internal logic error in the rgmd has prevented it from successfully executing a resource method.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


111697 Failed to delete scalable service in group %s for IP %s Port %d%c%s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


111797 sigaction: %s

Description:

The rpc.pmfd server was not able to set its signal handler. The message contains the system error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


112872 No permission for group to execute %s.

Description:

The specified path does not have the correct permissions as expected by a program.

Solution:

Set the permissions for the file so that it is readable and executable by the group.
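
To confirm that a file is readable and executable by the group, the group permission bits can be inspected. The sketch below uses Python's standard stat module; the function names are hypothetical, and which file to check is whatever path the message reports.

```python
import os
import stat


def mode_grants_group_rx(mode: int) -> bool:
    """True if the mode bits include both group-read and group-execute."""
    required = stat.S_IRGRP | stat.S_IXGRP
    return (mode & required) == required


def group_can_read_and_execute(path: str) -> bool:
    """Check the group permission bits of an existing file."""
    return mode_grants_group_rx(os.stat(path).st_mode)


if __name__ == "__main__":
    # 0o750 grants r-x to the group; 0o700 grants nothing to the group.
    print(mode_grants_group_rx(0o750), mode_grants_group_rx(0o700))
```

Note that this inspects only the permission bits; it does not check which group owns the file.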


113620 Can't create kernel thread

Description:

Failed to create a crucial kernel thread for client affinity processing on the node.

Solution:

If client affinity is a requirement for some of the sticky services, say due to data integrity reasons, the node should be restarted.


114036 clexecd: Error %d from putmsg

Description:

clexecd program has encountered a failed putmsg(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


114550 Unable to create <%s>: %s.

Description:

The HA-NFS stop method attempted to create the specified file but failed.

Solution:

Check the error message for the reason of failure and correct the situation. If unable to correct the situation, reboot the node.


114568 Adaptive server successfully started.

Description:

Sybase Adaptive Server has been successfully started by Sun Cluster HA for Sybase.

Solution:

This is an informational message; no user action is needed.


115256 file specified in USER_ENV %s doesn't exist

Description:

The 'User_env' property was set when configuring the resource. The file specified in the 'User_env' property does not exist or is not readable. The file should be specified with a fully qualified path.

Solution:

Specify an existing file with a fully qualified file name when creating the resource. If the resource is already created, update the resource property 'User_env'.


115461 in libsecurity __rpc_get_local_uid failed

Description:

A server (rpc.pmfd, rpc.fed or rgmd) refused an rpc connection from a client because the client failed UNIX authentication: it is not making the rpc call over the loopback interface. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


115987 execvp: %s

Description:

The rpc.pmfd server was not able to exec a new process, possibly due to bad arguments. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Verify that the file path to be executed exists. If all looks correct, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


116312 Unable to determine password for broker %s. Sending SIGKILL now.

Description:

The STOP method was unable to determine what the password was to shutdown the broker. The STOP method will send SIGKILL to shut it down.

Solution:

Check that the scs1mqconfig file is accessible and correctly specifies the password.


116499 Stopping liveCache times out with command %s.

Description:

Stopping liveCache timed out.

Solution:

Look for syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


116910 Unable to connect to Siebel database.

Description:

Siebel database may be unreachable.

Solution:

Please verify that the Siebel database resource is up.


117498 scha_resource_get error (%d) when reading extension property %s

Description:

Error occurred in API call scha_resource_get.

Solution:

Check syslog messages for errors logged from other system modules. Stop and start the fault monitor. If the error persists, disable the fault monitor and report the problem.


117770 Hostname %s is already plumbed.

Description:

An attempt was made to create a Network resource with the specified hostname, while the hostname was already plumbed on a cluster node.

Solution:

Specify a unique hostname on the cluster. It should be a valid hostname in /etc/inet/hosts file, should be on a subnet which is available on the cluster and this hostname should not be in use on any cluster node.
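
One part of the check above, whether the hostname appears in /etc/inet/hosts, can be sketched as follows. The code assumes the standard hosts layout (an address followed by one or more names, with '#' starting a comment). It does not verify the subnet or whether the address is already plumbed. The function name and sample entries are hypothetical.

```python
from typing import Set


def hostnames_in_hosts_file(text: str) -> Set[str]:
    """Collect every hostname and alias listed in hosts-file contents."""
    names: Set[str] = set()
    for line in text.splitlines():
        # Strip comments, then split into address followed by names.
        fields = line.split("#", 1)[0].split()
        names.update(fields[1:])
    return names


if __name__ == "__main__":
    # Hypothetical hosts entries.
    sample = (
        "127.0.0.1 localhost\n"
        "192.168.10.5 lh-web lh-web.example.com # logical host\n"
    )
    known = hostnames_in_hosts_file(sample)
    print("lh-web" in known, "lh-db" in known)
```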


118046 rebalance: no primary node could be found for resource group <%s>.

Description:

The rgmd is unable to bring the resource group online because all of its potential masters are down.

Solution:

Repair and reboot broken nodes so they may rejoin the cluster; or use scrgadm(1M) to edit the Nodelist property of the resource group so that it includes nodes that are cluster members.


118205 Script lccluster is not executable.

Description:

Script 'lccluster' is not executable.

Solution:

Make sure 'lccluster' is executable.


118261 Successfully stopped the service %s.

Description:

Specified data service stopped successfully.

Solution:

None. This is only an informational message.


119120 clconf: Key length is more than max supported length in clconf_ccr read

Description:

While reading configuration data through the CCR, a key was found whose length exceeds the maximum supported length.

Solution:

Check the CCR configuration information.


119649 clcomm: Unregister of pathend state proxy failed

Description:

The system failed to unregister the pathend state proxy.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


120470 (%s) t_sndudata: tli error: %s

Description:

Call to t_sndudata() failed. The "t_sndudata" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


120714 Error retrieving the resource property %s: %s.

Description:

An error occurred while reading the indicated property.

Solution:

Check syslog messages for errors logged from other system modules. If error persists, please report the problem.


120972 IPMP Failure.

Description:

The IPMP group hosting the LogicalHostname has failed.

Solution:

The LogicalHostname resource would be failed over to a different node. If that fails, check the system logs for other messages. Also, correct the networking problem on the node so that the IPMP group in question is healthy again.


121513 Successfully restarted service.

Description:

This message indicates that the RGM successfully restarted the resource.

Solution:

This is an informational message, no user action is required.


121858 tag %s: not suspended, cannot resume

Description:

The user sent a resume command to the rpc.fed server for a tag that is not suspended. An error message is output to syslog.

Solution:

Check the tag name.


122838 Error deleting PidLog <%s> (%s) for service with config file <%s>.

Description:

The resource was not able to remove the application's PidLog before starting it.

Solution:

Check that PidLog is set correctly and that the PidLog file is accessible. If needed, delete the PidLog file manually and start the resource group.


122154 Failed to parse xml: no events

Description:

Solution:


122160 Unable to write to file %s: %s.


122188 IPMP logical interface configuration operation failed with <%d>.


122638 lockd is not runing. Will retry in 2 seconds

Description:

HA-NFS started lockd, but lockd could not start.

Solution:

This is an informative message. HA-NFS will attempt to restart lockd.


123526 Prog <%s> step <%s>: Execution failed: no such method tag.

Description:

An internal error has occurred in the rpc.fed daemon which prevents step execution. This is considered a step failure.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem. Re-try the edit operation.


123984 All specified global device services are available.

Description:

All global device services specified directly or indirectly via the GlobalDevicePath and FilesystemMountPoint extension properties, respectively, are found to be available, that is, up and running.


124232 clcomm: solaris xdoor fcntl failed: %s

Description:

A fcntl operation failed. The "fcntl" man page describes possible error codes.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


124810 fe_method_full_name() failed for resource <%s>, resource group <%s>, method <%s>

Description:

Due to an internal error, the rgmd was unable to assemble the full method pathname. This is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


124935 Either extension property <Child_mon_level> is not defined, or an error occurred while retrieving this property; using the default value of -1.

Description:

Property Child_mon_level may not be defined in RTR file. Use the default value of -1.

Solution:

This is an informational message, no user action is needed.


125159 Load balancer setting distribution on %s:

Description:

The load balancer is setting the distribution for the specified service group.

Solution:

This is an informational message, no user action is needed.


125356 Failed to connect to %s:%d: %s.

Description:

The data service fault monitor probe was trying to connect to the host and port specified and failed. There may be a prior message in syslog with further information.

Solution:

Make sure that the port configuration for the data service matches the port configuration for the underlying application.


126142 fatal: new_str strcpy: %s (UNIX error %d)

Description:

The rgmd failed to allocate memory, most likely because the system has run out of swap space. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

The problem is probably cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.


126318 fatal: Unknown object type bound to %s

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


126467 HA: not implemented for userland

Description:

An invocation was made on an HA server object in user land. This is not currently supported.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


127182 fatal: thr_create returned error: %s (UNIX error %d)

Description:

The rgmd failed in an attempt to create a thread. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Fix the problem described by the UNIX error message. The problem may have already been corrected by the node reboot.


127411 Error in reading /etc/mnttab: getmntent() returns <%d>

Description:

Failed to read /etc/mnttab.

Solution:

Check with system administrator and make sure /etc/mnttab is properly defined.


127624 must be superuser to start %s

Description:

The ucmmd process was not started by the superuser. ucmmd is going to exit now.

Solution:

None. This is an internal error.


129832 Incorrect syntax in Environment_file. Ignoring %s

Description:

Incorrect syntax in Environment_file. Correct syntax is: VARIABLE=VALUE

Solution:

Please check the Environment_file and correct the syntax errors.
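
To locate the offending lines, a small checker along these lines may help. It treats a line as valid when it has the form VARIABLE=VALUE. Two things here are assumptions rather than documented behavior: that variable names follow the usual shell identifier rule, and that blank lines and lines starting with '#' are skipped. The function name and sample content are hypothetical.

```python
import re
from typing import List, Tuple

# Assumption: names use letters, digits and underscores, not starting with a digit.
_ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=.*$")


def find_bad_lines(text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that are not VARIABLE=VALUE."""
    bad: List[Tuple[int, str]] = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Assumption: blank lines and '#' comment lines are not reported.
        if not stripped or stripped.startswith("#"):
            continue
        if not _ASSIGNMENT.match(stripped):
            bad.append((number, line))
    return bad


if __name__ == "__main__":
    sample = "ORACLE_SID=prod\n# comment\nexport FOO=bar\nPATH=/usr/bin\n=oops\n"
    for number, line in find_bad_lines(sample):
        print(number, line)
```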


130822 CMM: join_cluster: failed to register ORB callbacks with CMM.

Description:

The system can not continue when callback registration fails.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


131492 pxvfs::mount(): global mounts are not enabled (need to run "clconfig -g" first)

Description:

A global mount command was attempted before the node has initialized the global file system name space. Typically this is caused by trying to perform a global mount while the system is booted in single-user mode.

Solution:

If the system is not at run level 2 or 3, change to run level 2 or 3 using the init(1M) command. Otherwise, check message logs for errors during boot.


132032 clexecd: strdup returned %d. Exiting.

Description:

clexecd program has encountered a failed strdup(3C) system call. The error message indicates the error number for the failure.

Solution:

If the error number is 12 (ENOMEM), install more memory, increase swap space, or reduce peak memory consumption. If error number is something else, contact your authorized Sun service provider to determine whether a workaround or patch is available.
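
When triaging the reported error number, it can help to translate it into its symbolic name. The sketch below uses Python's standard errno module on the machine where it runs; errno values are platform-specific, so treating the local table as matching the cluster node is an assumption. The function names are hypothetical.

```python
import errno
import os


def describe_errno(number: int) -> str:
    """Return 'NAME: description' for an error number, using the local errno table."""
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


def is_out_of_memory(number: int) -> bool:
    """True when the error number is ENOMEM on this platform."""
    return number == errno.ENOMEM


if __name__ == "__main__":
    reported = 12  # value taken from the clexecd message
    print(describe_errno(reported), is_out_of_memory(reported))
```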


134167 Unable to set maximum number of rpc threads.

Description:

The rpc.pmfd server was not able to set the maximum number of rpc threads. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


134411 %s can't unplumb

Description:

This means that the Logical IP address could not be unplumbed from an adapter.

Solution:

There could be other related error messages which might be helpful. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


134417 Global service <%s> of path <%s> is in maintainance.

Description:

Service is not supported by HA replica.

Solution:

Resume the service by using scswitch(1M).


135918 CMM: Quorum device %ld (%s) added; votecount = %d, bitmask of nodes with configured paths = 0x%llx.

Description:

The specified quorum device with the specified votecount and configured paths bitmask has been added to the cluster. The quorum subsystem treats a quorum device in maintenance state as being removed from the cluster, so this message will be logged when a quorum device is taken out of maintenance state as well as when it is actually added to the cluster.

Solution:

This is an informational message, no user action is needed.


136330 This resource depends on a HAStoragePlus resouce that is not online. Unable to perform validations.

Description:

The resource depends on a HAStoragePlus resource that is not online on any node. Some of the files required for validation checks are not accessible. Validations cannot be performed on any node.

Solution:

Enable the HAStoragePlus resource that this resource depends on and reissue the command.


136955 Failed to retrieve main dispatcher pid.

Description:

Failed to retrieve the process ID for the main dispatcher process indicating the main dispatcher process is not running.

Solution:

No action needed. The fault monitor will detect this and take appropriate action.


137294 method_full_name: strdup failed

Description:

The rgmd server was not able to create the full name of the method, while trying to connect to the rpc.fed server, possibly due to low memory. An error message is output to syslog.

Solution:

Investigate whether the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


137606 clcomm: Pathend %p: disconnect_node not allowed

Description:

The system maintains state information about a path. The disconnect_node operation is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


138972 could not set timeout: %s

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) because it could not set the rpc call timeout. The rpc error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


139773 clexecd: Error %d from strdup

Description:

clexecd program has encountered a failed strdup(3C) system call. The error message indicates the error number for the failure.

Solution:

If the error number is 12 (ENOMEM), install more memory, increase swap space, or reduce peak memory consumption. If error number is something else, contact your authorized Sun service provider to determine whether a workaround or patch is available.


139852 pmf_set_up_monitor: pmf_add_triggers: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


140225 The request to relocate resource %s completed successfully.

Description:

The resource named was relocated to a different node.

Solution:

This is an informational message, no user action is needed.


141062 Failed to connect to host %s and port %d: %s.

Description:

An error occurred while fault monitor attempted to probe the health of the data service.

Solution:

Wait for the fault monitor to correct this by doing a restart or failover. For more details about the error, check the syslog messages.


141236 Failed to format stringarray for property %s from value %s.

Description:

The validate method for the scalable resource network configuration code was unable to convert the property information given to a usable format.

Solution:

Verify the property information was properly set when configuring the resource.


141242 HA: revoke not implemented for replica_handler

Description:

An attempt was made to use a feature that has not been implemented.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


141970 in libsecurity caller has bad uid: get_local_uid=%d authsys=%d desired uid=%d

Description:

A server (rpc.pmfd, rpc.fed or rgmd) refused an rpc connection from a client because it has the wrong uid. The actual and desired uids are shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


142779 Unable to open failfast device

Description:

A server (rpc.pmfd or rpc.fed) was not able to establish a link to the failfast device, which ensures that the host aborts if the server dies. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


142889 Starting up saposcol process under PMF times out.

Description:

Starting up the SAP OS collector process under the control of the Process Monitor Facility timed out. This might happen under heavy system load.

Solution:

Consider increasing the start timeout value.


143694 lkcm_act: caller is already registered

Description:

Message indicating that udlm is already registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


144303 fatal: uname: %s (UNIX error %d)

Description:

A uname(2) system call failed. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


145270 Cannot determine if the server is secure: assuming non-secure.

Description:

An error occurred while parsing the Netscape configuration file to determine whether the Netscape server is running in secure or non-secure mode. As a result, the data service assumes a non-secure Netscape server and probes the server as such.

Solution:

Check the Netscape configuration file to make sure that it exists and that it contains information about whether the server is running as a secure server or not.


145770 CMM: Monitoring disabled.

Description:

Transport path monitoring has been disabled in the cluster. It is enabled by default.

Solution:

This is an informational message, no user action is needed.


145800 Validation failed. ORACLE_HOME/bin/sqlplus not found ORACLE_HOME=%s

Description:

Oracle binaries (sqlplus) not found in ORACLE_HOME/bin directory. ORACLE_HOME specified for the resource is indicated in the message. HA-Oracle will not be able to manage resource if ORACLE_HOME is incorrect.

Solution:

Specify correct ORACLE_HOME when creating resource. If resource is already created, please update resource property 'ORACLE_HOME'.


145893 CMM: Unable to read quorum information. Error = %d.

Description:

The specified error was encountered while trying to read the quorum information from the CCR. This is probably because the CCR tables were modified by hand, which is an unsupported operation. The node will panic.

Solution:

Reboot the node in non-cluster (-x) mode and restore the CCR tables from the other nodes in the cluster or from backup. Reboot the node back in cluster mode. The problem should not reappear.


146238 CMM: Halting to prevent split brain with node %ld.

Description:

Due to a connection failure with the specified node, the CMM is failing this node to prevent split brain partial connectivity.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


146961 Signal %d terminated the child process.

Description:

An unexpected signal caused the termination of the program that checks the availability of name service.

Solution:

Save a copy of the /var/adm/messages files on all nodes. If a core file was generated, submit the core to your service provider. Contact your authorized Sun service provider for assistance in diagnosing the problem.


147230 Invalid resource settings.


147516 sigprocmask: %s

Description:

The rpc.pmfd server was not able to set its signal mask. The message contains the system error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


148023 method <%s> completed successfully for resource <%s>, resource group <%s>

Description:

RGM invoked a callback method for the named resource, as a result of a cluster reconfiguration, scha_control GIVEOVER, or scswitch. The method completed successfully.

Solution:

This is an informational message, no user action is needed.


148393 Unable to create thread. Exiting.\n

Description:

clexecd program has encountered a failed thr_create(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


148465 Prog <%s> step <%s>: RPC connection error.

Description:

An attempted program execution failed, due to an RPC connection problem. This failure is considered a program failure.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. If the same error recurs, you might have to reboot the affected node.


148526 fatal: Cannot get local nodename

Description:

An internal error has occurred. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


148821 Error in trying to access the configured network resources : %s.

Description:

Failed to get the available network address resources for this resource.

Solution:

This is an internal error. Save the /var/adm/messages file and contact an authorized Sun service provider.


148902 No node was specified as part of property %s for element %s. The property must be specified as %s=Weight%cNode,Weight%cNode,...

Description:

The property was specified incorrectly.

Solution:

Set the property using the correct syntax.
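
The expected syntax can be checked before the property is set. The following is a minimal sketch, not the Sun Cluster implementation: the function name parse_weight_list is hypothetical, the separator printed as %c in the message is assumed to be "@" purely for illustration, and weights are assumed to be non-negative integers.

```python
def parse_weight_list(value, sep="@"):
    """Parse 'Weight<sep>Node,Weight<sep>Node,...' into (weight, node) pairs.

    The real separator is whatever character the message prints for %c;
    "@" is only an assumed placeholder.
    """
    pairs = []
    for element in value.split(","):
        weight_text, found, node = element.strip().partition(sep)
        if not found or not node:
            # Corresponds to "No node was specified ... for element ..."
            raise ValueError(f"no node specified in element {element!r}")
        if not weight_text.isdigit():
            # Assumption: weights are non-negative integers.
            raise ValueError(f"invalid weight in element {element!r}")
        pairs.append((int(weight_text), node))
    return pairs
```

An element such as "5" with no separator and no node raises ValueError, which is the condition this message reports.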


149184 clcomm: inbound_invo::signal:_state is 0x%x

Description:

The internal state describing the server side of a remote invocation is invalid when a signal arrives during processing of the remote invocation.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


150105 This list element in System property %s has an invalid IP address (hostname): %s.

Description:

The system property that was named does not have a valid hostname or dotted-decimal IP address string.

Solution:

Change the value of the property to use a valid hostname or dotted-decimal IP address string.
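
A candidate value can be checked for form before the property is changed. This is a sketch under stated assumptions: the helper names are hypothetical, the check is purely syntactic (it does not resolve the hostname, which the cluster software presumably does), and only IPv4 dotted-decimal strings are considered.

```python
import ipaddress
import re

# One DNS label: letters, digits, and interior hyphens, up to 63 characters.
_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")

def is_dotted_decimal(text):
    """True if text is a well-formed IPv4 dotted-decimal address."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ipaddress.AddressValueError:
        return False

def looks_like_hostname(text):
    """True if text is syntactically plausible as a hostname."""
    if not text or len(text) > 253:
        return False
    labels = text.rstrip(".").split(".")
    if labels[-1].isdigit():
        # An all-numeric last label means a malformed address, not a name.
        return False
    return all(_LABEL.match(label) for label in labels)

def is_valid_entry(text):
    return is_dotted_decimal(text) or looks_like_hostname(text)
```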


150535 clcomm: Could not find %s(): %s

Description:

The function get_libc_func could not find the specified function for the reason specified. Refer to the man pages for "dlsym" and "dlerror" for more information.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


150628 sigaddset: %s

Description:

The rpc.pmfd server was not able to add a signal to a signal set. The message contains the system error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


152159 WARNING: lkcm_sync: udlm_send_reply failed, forcing reconfiguration

Description:

A reconfiguration will start.

Solution:

None.


152478 Monitor_retry_count or Monitor_retry_interval is not set.

Description:

The resource property Monitor_retry_count or Monitor_retry_interval has not been set. These properties control the restarts of the fault monitor.

Solution:

Check whether the properties are set. If not, set these values using scrgadm(1M).


152546 ucm_callback for stop_trans generated exception %d

Description:

ucmm callback for stop transition failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


153018 WARNING: missing msg, expected: dont_care, %d, %d, but received: %d %d, %d. FORCING reconfiguration.

Description:

Unexpected message received by udlm. This will trigger an OPS reconfiguration.

Solution:

None.


153025 Failed to unplumb %s from %s.


154317 launch_validate: fe_method_full_name() failed for resource <%s>, resource group <%s>, method <%s>

Description:

Due to an internal error, the rgmd was unable to assemble the full method pathname for the VALIDATE method. This is considered a VALIDATE method failure. This in turn will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Re-try the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


155479 ERROR: VALIDATE method timeout property of resource <%s> is not an integer

Description:

The indicated resource's VALIDATE method timeout, as stored in the CCR, is not an integer value. This might indicate corruption of CCR data or rgmd in-memory state; the VALIDATE method invocation will fail. This in turn will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Use scrgadm(1M) -pvv to examine resource properties. If the VALIDATE method timeout or other property values appear corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Re-try the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


156178 Encountered an error while starting device services.

Description:

An error was detected by the HAStoragePlus resource's start method. This error was encountered during attempts to start a device service on a given node. Startup of the device services occurs when the HAStoragePlus resource is brought online on a node for the first time or after the resource is switched or failed over to another node. It is highly likely that a DCS function call returned an error.

Solution:

Examine the GlobalDevicePaths and FilesystemMountpoint extension properties for any invalid specifications. Examine the status of DCS. Contact your authorized Sun service provider for assistance in diagnosing the problem.


156527 Unable to execute <%s>: <%s>.

Description:

Sun Cluster was unable to execute a command.

Solution:

The problem could be caused by:

1) No more process table entries for a fork()

2) No available memory

For the above two causes, the only option is to reboot the node. The problem might also be caused by:

3) The command that could not execute is not correctly installed

For the above cause, the command might have the wrong path or file permissions. Correctly install the command.


157213 CCR: The repository on the joining node %s could not be recovered, join aborted.

Description:

The indicated node failed to update its repository to match the repositories of the nodes in the current membership, so it will not be able to join the current membership.

Solution:

There may be other related messages on the indicated node that help diagnose the problem. For example, if the root disk failed, it needs to be replaced. If the root disk is full, remove some unnecessary files to free up space.


158471 Share command %s did not complete successfully.


158530 CMM: Halting because this node is severely short of resident physical memory; availrmem = %ld pages, tune.t_minarmem = %ld pages.

Description:

The local node does not have sufficient resident physical memory, which may cause it to declare other nodes down. To prevent this action, the local node is going to halt.

Solution:

There may be other related messages that indicate why the node reached the low memory state. Resolve the problem and reboot the node. If you are unable to resolve the problem, contact your authorized Sun service provider to determine whether a workaround or patch is available.


158836 Endpoint %s initialization error - errno = %d, failing associated pathend.

Description:

Communication with another node could not be established over the path.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


158981 Path <%s> is not a valid file system mount point specified in /etc/vfstab

Description:

The "ServicePaths" property of the hastorage resource should be a valid disk group, device special file, or global file system mount point specified in the /etc/vfstab file.

Solution:

Check the definition of the extension property "ServicePaths" of SUNW.HAStorage type resource. If they are file system mount points, verify that the /etc/vfstab file contains correct entries.
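
The mount point check can be approximated by scanning the file directly. The following is a minimal sketch, assuming the standard /etc/vfstab field order (device to mount, device to fsck, mount point, and so on); the function name vfstab_mount_points is hypothetical and is not part of Sun Cluster.

```python
def vfstab_mount_points(vfstab_text):
    """Return the set of mount points listed in vfstab-formatted text."""
    points = set()
    for line in vfstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # The mount point is the third whitespace-separated field;
        # "-" means the entry has no mount point.
        if len(fields) >= 3 and fields[2] != "-":
            points.add(fields[2])
    return points

def is_listed_mount_point(path, vfstab_text):
    return path in vfstab_mount_points(vfstab_text)
```

To use it on a node, read the file first, for example with open("/etc/vfstab").read(), and pass the text together with each path from ServicePaths.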


159059 IP address (hostname) %s from %s at entry %d in list property %s does not belong to any network resource used by resource %s.

Description:

The hostname or dotted-decimal IP address string in the message does not resolve to an IP address equal to any resolved IP address from the named resource's Network_resources_used property. Any explicitly named hostname or dotted-decimal IP address string in the named list property must resolve to an IP address equal to a resolved IP address from Network_resources_used.

Solution:

Either modify the hostname or dotted-decimal IP address string from the entry in the named property or modify Network_resources_used so that the entry resolves to an IP address equal to a resolved IP address from Network_resources_used.
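
The rule described above amounts to a set intersection over resolved addresses. The sketch below illustrates that idea only; it is not the code used by the cluster software. The function names are hypothetical, and network_hosts is assumed to be a list of hostnames already extracted from the resources named in Network_resources_used. The resolver parameter defaults to the standard socket.gethostbyname_ex call and can be replaced for testing.

```python
import socket

def resolve_all(name, resolver=socket.gethostbyname_ex):
    """Return the set of IPv4 address strings that name resolves to."""
    _canonical, _aliases, addresses = resolver(name)
    return set(addresses)

def belongs_to_network_resources(entry, network_hosts,
                                 resolver=socket.gethostbyname_ex):
    """True if entry resolves to an address shared with any network host."""
    allowed = set()
    for host in network_hosts:
        allowed |= resolve_all(host, resolver)
    return bool(resolve_all(entry, resolver) & allowed)
```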


159501 host %s failed: %s

Description:

The rgm is not able to establish an rpc connection to the rpc.fed server on the host shown, and the error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


159592 clcomm: Cannot make high %d less than current total %d

Description:

An attempt was made to change the flow control policy parameter specifying the high number of server threads for a resource pool. The system does not allow the high number to be reduced below current total number of server threads.

Solution:

No user action required.


160167 Server successfully started.

Description:

Informational message. Oracle server has been successfully started by HA-Oracle.

Solution:

None


160400 fatal: fcntl(F_SETFD): %s (UNIX error %d)

Description:

This error should not occur. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


160619 Could not enlarge buffer for DBMS log messages: %m

Description:

Fault monitor could not allocate memory for reading RDBMS log file. As a result of this error, fault monitor will not scan errors from log file. However it will continue fault monitoring.

Solution:

Check if system is low on memory. If problem persists, please stop and start the fault monitor.


161104 Adaptive server stopped.

Description:

The Adaptive server has been shutdown by Sun Cluster HA for Sybase.

Solution:

This is an information message, no user action is needed.


161275 reservation fatal error(UNKNOWN) - Illegal command line option

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


161683 %s/%s/install/startserver does not have executepermissions set.

Description:

The Sybase Adaptive Server is started by execution of the 'startserver' file. The file's current permissions prevent its execution. The full path name of the 'startserver' file is specified as a part of this error message. This file is located in the $SYBASE/$ASE/install directory.

Solution:

Verify the permissions of the 'startserver' file and ensure that it can be executed. If not, use chmod to modify its execute permissions.
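
The permission check and repair can also be scripted. This is a minimal sketch, assuming that the owner execute bit is the one that matters; the function name ensure_executable is hypothetical, and the path passed to it is whatever full path the error message reports.

```python
import os
import stat

def ensure_executable(path):
    """Return True if path already had the owner execute bit set.

    Otherwise add the owner execute bit (like 'chmod u+x') and return False.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)  # permission bits only
    if mode & stat.S_IXUSR:
        return True
    os.chmod(path, mode | stat.S_IXUSR)
    return False
```

Other permission bits are left unchanged, so the result is the same as running chmod u+x on the file.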


161934 pid %d is stopped.

Description:

HA-NFS fault monitor has detected that the specified process has been stopped with a signal.

Solution:

No action is required. The HA-NFS fault monitor will kill and restart the stopped process.


161991 Load balancer for group '%s' setting weight for node %s to %d

Description:

This message indicates that the user has set a new weight for a particular node from an old value.

Solution:

This is an informational message, no user action is needed.


162419 ERROR: launch_method: cannot get Failover_mode for resource <%s>, assuming NONE.

Description:

A method execution has failed or timed out. For some reason, the rgmd is unable to obtain the Failover_mode property of the resource. The rgmd assumes a setting of NONE for this property, therefore avoiding the outcome of rebooting the node (for STOP method failure) or failing over the resource group (for START method failure). For these cases, the resource is placed into a STOP_FAILED or START_FAILED state, respectively.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and contact your authorized Sun service provider for assistance in diagnosing the problem.


162502 tag %s: %s

Description:

The tag specified that is being run under the rpc.fed produced the specified message.

Solution:

This message is for informational purposes only. No user action is necessary.


162505 Could not start Siebel server: %s.

Description:

Siebel server could not start because a service it depends on is not running.

Solution:

Make sure that the Siebel database and the Siebel gateway are running before attempting to restart the Siebel server resource.


162851 Unable to lookup nfs:nfs_server:calls from kstat.

Description:

See 176151

Solution:

See 176151


163027 CMM: Quorum device %s: owner set to node %ld.

Description:

The specified node has taken ownership of the specified quorum device.

Solution:

This is an informational message, no user action is needed.


164164 Starting Sybase %s: %s. Startup file: %s

Description:

Sybase server is going to be started by Sun Cluster HA for Sybase.

Solution:

This is an information message, no user action is needed.


164757 reservation fatal error(%s) - realloc() error, errno %d

Description:

The device fencing program has been unable to allocate required memory.

Solution:

Memory usage should be monitored on this node and steps taken to provide more available memory if problems persist. Once memory has been made available, the following steps may need to be taken: If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, access to shared devices can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. The device group can be switched back to this node if desired by using the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


165512 reservation error(%s) - my_map_to_did_device() error in other_node_status()

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


165527 Oracle UDLM package is not properly installed. %s not found.

Description:

Oracle udlm package installation problem.

Solution:

Make sure Oracle UDLM package is properly installed.


165731 Backup server successfully started.

Description:

The Sybase backup server has been successfully started by Sun Cluster HA for Sybase.

Solution:

This is an information message, no user action is needed.


166235 Unable to open door %s: %s

Description:

Solution:


166362 clexecd: Got back %d from I_RECVFD. Looks like parent is dead.

Description:

Parent process in the clexecd program is dead.

Solution:

If the node is shutting down, ignore the message. If not, the node on which this message is seen will shut down to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


166489 reservation error(%s) error. Node %d is not in the cluster

Description:

A node which the device fencing program was communicating with has left the cluster.

Solution:

This is an informational message, no user action is needed.


166560 Maximum Primaries is %d. It should be 1.

Description:

An invalid value has been set for Maximum Primaries. The value should be 1.

Solution:

Reset this value using scrgadm(1M).


166590 NULL value returned for the extension property <%s>.

Description:

The extension property <%s> is set to NULL in the RTR File.

Solution:

Serious error, the RTR file is corrupted. Reload the package for HA-NetBackup SUNWscnb. If problem persists contact the Sun Cluster HA developer.


167108 Starting Oracle server.

Description:

Informational message. Oracle server is being started by HA-Oracle.

Solution:

None


167253 Server stopped successfully.

Description:

Informational message. Oracle server successfully stopped.

Solution:

None


167824 %s has been deleted.\nIf %s was hosting any HA IP addresses then these should be re-registered.

Description:

We do not allow deleting of an IPMP group which is hosting Logical IP addresses registered by RGM. Therefore we notify the user of the possible error.

Solution:

This message is informational; no user action is needed.


168150 INTERNAL ERROR CMM: Cannot bind quorum algorithm object to local name server.

Description:

There was an error while binding the quorum subsystem object to the local name server.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


168383 Service not started

Description:

There was a problem detected in the initial startup of the service.

Solution:

Attempt to start the service by hand to see if there are any apparent problems with the application. Correct these problems and attempt to start the data service again.


168630 could not read cluster name

Description:

Could not get cluster name. Perhaps the system is not booted as part of the cluster.

Solution:

Make sure the node is booted as part of a cluster.


168917 %s: Not able to get the private network address.

Description:

The daemon is unable to get private net address. Cluster is configured incorrectly on the machine where message is logged.


168970 sun_udlm_read_oracle_cfg: open failed: %s ... will use default values

Description:

Could not read parameter values from config file. Will use default values instead.

Solution:

None.


169308 Database might be down, HA-SAP won't take any action. Will check again in %d seconds.

Description:

Database connection check failed indicating the database might be down. HA-SAP will not take any action, but will check the database connection again after the time specified.

Solution:

Make sure the database and the HA software for the database are functioning properly.


169606 Unable to create thread. Exiting.

Description:

clexecd program has encountered a failed thr_create(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


169608 INTERNAL ERROR: scha_control_action: invalid action <%d>

Description:

The scha_control function has encountered an internal logic error. This will cause scha_control to fail with a SCHA_ERR_INTERNAL error, thereby preventing a resource-initiated failover.

Solution:

Please save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


169765 Configuration file not found.

Description:

Internal error. Configuration file for online_check not found.

Solution:

Please report this problem.


171031 reservation fatal error(%s) - get_control() failure

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


171786 listener %s is not running. Attempting restart.

Description:

Listen monitor has detected failure of listener. Monitor will attempt to restart the listener.

Solution:

None


171878 in libsecurity setnetconfig failed when initializing the client: %s - %s

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) because it could not establish a rpc connection for the network specified. The rpc error and the system error are shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


172566 Stopping oracle server using shutdown abort

Description:

Informational message. Oracle server will be stopped using 'shutdown abort' command.

Solution:

Examine 'Stop_timeout' property of the resource and increase 'Stop_timeout' if you don't wish to use 'shutdown abort' for stopping Oracle server.


172817 Cluster nodes %u and %u have incompatible versions for the %s protocol.

Description:

This is an informational message from the cluster version manager and may help diagnose what software component is failing to find a compatible version during a rolling upgrade. This error may also be due to attempting to boot a cluster node in 64-bit address mode when other nodes are booted in 32-bit address mode, or vice versa.

Solution:

This message is informational; no user action is needed. However, if this message is for a core component, one or more nodes may shut down in order to preserve system integrity. Verify that any recent software installations completed without errors and that the installed packages or patches are compatible with the rest of the installed software.


173313 Unable to restart NFS daemons.


173733 Failed to retrieve the resource type property %s for %s: %s.

Description:

The query for a property failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


174078 Adaptive server shutdown with nowait failed. STOP_FILE %s.

Description:

The Sybase adaptive server failed to shutdown with the nowait option using the file specified in the STOP_FILE property.

Solution:

No user action is needed. Other syslog messages, the log file of Sun Cluster HA for Sybase or the adaptive server log file may provide additional information on possible reasons for the failure.


174751 Failed to retrieve the process monitor facility tag.

Description:

Failed to create the tag that is used to register with the process monitor facility.

Solution:

Check the syslog messages that occurred just before this message. In case of an internal error, save the /var/adm/messages file and contact your authorized Sun service provider.


174909 Failed to open the resource handle: %s.

Description:

An API operation has failed while retrieving the resource property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components. For the resource name and the property name, check the current syslog message.


174928 ERROR: process_resource: resource <%s> is offline pending boot, but no BOOT method is registered

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


175370 svc_restore_priority: Could not restore original scheduling parameters: %s

Description:

The server was not able to restore the original scheduling mode. The system error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


175461 Failed to open resource %s: %s.

Description:

The PMF action script supplied by the DSDL could not retrieve information about the given resource.

Solution:

Check the syslog messages around the time of the error for messages indicating the cause of the failure. If this error persists, contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


175553 clconf: Your configuration file is incorrect! The type of property %s is not found

Description:

Could not find the type of property in the configuration file.

Solution:

Check the configuration file.


176151 Unable to lookup nfs:nfs_server from kstat:%s

Description:

HA-NFS fault monitor failed to lookup the specified kstat parameter. The specific cause is logged with the message.

Solution:

Run the following command on the cluster node where this problem is encountered:

/usr/bin/kstat -m nfs -i 0 -n nfs_server -s calls

Barring resource availability issues, this call should complete successfully. If it fails without generating any output, contact your authorized Sun service provider.


176860 Error: Unable to update scha_control timestamp file <%s> for resource <%s>

Description:

The rgmd failed in a call to utime(2) on the local node. This may prevent the anti-"pingpong" feature from working, which may permit a resource group to fail over repeatedly between two or more nodes. The failure of the utime call might indicate a more serious problem on the node.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.


176974 Validation failed. SYBASE environment variable is not set in Environment_file.

Description:

SYBASE environment variable is not set in environment_file or is empty string.

Solution:

Check the file specified in the Environment_file property. Check the value of the SYBASE environment variable specified in the Environment_file. The SYBASE environment variable should be set to the directory of the Sybase ASE installation.
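
The condition reported here (SYBASE missing or set to an empty string) can be reproduced with a small check. This is a sketch only: it assumes the environment file contains shell-style NAME=value lines, optionally prefixed with "export", which may not match every file; the function names are hypothetical.

```python
def read_env_value(env_text, name):
    """Return the last value assigned to name, or None if never assigned."""
    result = None
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, found, value = line.partition("=")
        if found and key.strip() == name:
            # Drop surrounding whitespace and simple quoting.
            result = value.strip().strip('"').strip("'")
    return result

def sybase_is_set(env_text):
    """True only if SYBASE is assigned a non-empty value."""
    return bool(read_env_value(env_text, "SYBASE"))
```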


177070 Got back %d in revents of the control fd. Exiting.

Description:

clexecd program has encountered an error.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


177252 reservation warning(%s) - MHIOCGRP_INRESV error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


177878 Can't access kernel timeout facility

Description:

Failed to maintain timeout state for client affinity on the node.

Solution:

If client affinity is a requirement for some of the sticky services, say due to data integrity reasons, the node should be restarted.


177899 t_bind (open_cmd_port) failed

Description:

Call to t_bind() failed. The "t_bind" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


179364 CCR: Invalid CCR metadata.

Description:

The CCR could not find valid metadata on all nodes of the cluster.

Solution:

Boot the cluster in -x mode to restore the cluster repository on all the nodes in the cluster from backup. The cluster repository is located at /etc/cluster/ccr/.


179953 Failed to parse xml: too many events

Description:

Solution:


180002 Failed to stop the monitor server using %s.

Description:

Sun Cluster HA for Sybase failed to stop the monitor server using the file specified in the STOP_FILE property. Other syslog messages and the log file will provide additional information on possible reasons for the failure. It is likely that the adaptive server terminated prior to shutdown of the monitor server.

Solution:

Please check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.


181193 Cannot access file <%s>, err = <%s>

Description:

The rgmd has failed in an attempt to stat(2) a file used for the anti-"pingpong" feature. This may prevent the anti-pingpong feature from working, which may permit a resource group to fail over repeatedly between two or more nodes. The failure to access the file might indicate a more serious problem on the node.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.


182725 /etc/hostname.%s has the keyword group already

Description:

This means that auto-create was called even though the /etc/hostname.adp file has the keyword "group". Someone might have hand edited that file. This could also happen if someone deletes an IPMP group - A notification should have been provided for this.

Solution:

Please change the file back to its original contents. Try the scrgadm command again. We do not allow IPMP groups to be deleted once they are configured. If the problem persists contact your authorized Sun service provider for suggestions.


183071 Cannot Execute %s: %s.

Description:

Failure in executing the command.

Solution:

Check the syslog message for the command description. Check whether the system is low in memory or the process table is full and take appropriate action. Make sure that the executable exists.


183799 clconf: CSR not initialized

Description:

While executing task in clconf and modifying the state of proxy, found component CSR not initialized.

Solution:

Check the CSR component in the configuration file.


183934 Waiting for %s to come up.

Description:

The specific service or process is not yet up.

Solution:

This is an informative message. Suitable action may be taken if the specified service or process does not come up within a configured time limit.


184139 scvxvmlg warning - found no match for %s, removing it

Description:

The program responsible for maintaining the VxVM device namespace has discovered inconsistencies between the VxVM device namespace on this node and the VxVM configuration information stored in the cluster device configuration system. If configuration changes were made recently, then this message should reflect one of the configuration changes. If no changes were made recently or if this message does not correctly reflect a change that has been made, the VxVM device namespace on this node may be in an inconsistent state. VxVM volumes may be inaccessible from this node.

Solution:

If this message correctly reflects a configuration change to VxVM diskgroups then no action is required. If the change this message reflects is not correct, then the information stored in the device configuration system for each VxVM diskgroup should be examined for correctness. If the information in the device configuration system is accurate, then executing '/usr/cluster/lib/dcs/scvxvmlg' on this node should restore the device namespace. If the information stored in the device configuration system is not accurate, it must be updated by executing '/usr/cluster/bin/scconf -c -D name=diskgroup_name' for each VxVM diskgroup with inconsistent information.


185089 CCR: Updating table %s failed to startup on node %s.

Description:

The operation to update the indicated table failed to start on the indicated node.

Solution:

There may be other related messages on the nodes where the failure occurred, which may help diagnose the problem. If the root disk failed, it needs to be replaced. If the indicated table was deleted by accident, boot the offending node(s) in -x mode to restore the indicated table from other nodes in the cluster. The CCR tables are located at /etc/cluster/ccr/. If the root disk is full, remove some unnecessary files to free up some space.


185191 MAC addresses are not unique per subnet

Description:

What this means is that there are at least two adapters on a subnet which have the same MAC address. IPMP makes the assumption that all adapters have unique MAC addresses.

Solution:

Look at the ifconfig man page for how to set MAC addresses manually. This is, however, a temporary fix; the real fix is to upgrade the hardware so that the adapters have unique MAC addresses.
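
To find which adapters clash, the adapter, subnet, and MAC address of each interface can be collected by hand and compared. The sketch below assumes that data is already available as tuples; the function names and the sample adapter names are hypothetical. MAC addresses are normalized first because some tools print octets without leading zeros.

```python
from collections import defaultdict


def normalize_mac(mac):
    """Lower-case each octet and pad it to two hex digits, e.g. 8:0:20 -> 08:00:20."""
    return ":".join(f"{int(part, 16):02x}" for part in mac.split(":"))


def duplicate_macs(adapters):
    """adapters: iterable of (adapter_name, subnet, mac).

    Return {(subnet, mac): [adapter names]} for every MAC address that
    appears more than once on the same subnet.
    """
    seen = defaultdict(list)
    for name, subnet, mac in adapters:
        seen[(subnet, normalize_mac(mac))].append(name)
    return {key: names for key, names in seen.items() if len(names) > 1}
```

An empty result means every adapter on each subnet has a unique MAC address.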


185465 No action on DBMS Error %s: %ld

Description:

Database server returned error. Fault monitor does not take any action on this error.

Solution:

No action required.


185713 Unable to start lockd.

Description:

HA-NFS was not able to start lockd.

Solution:

This is an informative message. HA-NFS will try to restart lockd.


185720 lkdb_parm: lib initialization failed

Description:

Initializing a library to get the static lock manager parameters failed.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


185839 IP address (hostname) and Port pairs %s%c%d and %s%c%d in property %s, at entries %d and %d, effectively duplicate each other. The port numbers are the same and the resolved IP addresses are the same.

Description:

The two list entries at the named locations in the named property have port numbers that are identical, and also have IP address (hostname) strings that resolve to the same underlying IP address. An IP address (hostname) string and port entry should only appear once in the property.

Solution:

Specify the property with only one occurrence of the IP address (hostname) string and port entry.
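
The notion of entries that "effectively duplicate each other" can be illustrated with a short sketch. It is not the validation code that emits this message: the function name is hypothetical, and the entries are assumed to be already split into (host, port) pairs. Two entries collide when the port numbers match and the host strings resolve to the same IP address.

```python
import socket


def find_effective_duplicates(entries, resolve=socket.gethostbyname):
    """entries: list of (hostname_or_ip, port) pairs.

    Return a list of (first_index, later_index) pairs whose hosts resolve
    to the same IP address and whose ports are equal.
    """
    first_seen = {}
    duplicates = []
    for index, (host, port) in enumerate(entries):
        key = (resolve(host), int(port))
        if key in first_seen:
            duplicates.append((first_seen[key], index))
        else:
            first_seen[key] = index
    return duplicates
```

The resolve parameter defaults to a real name lookup. Passing a dictionary lookup instead allows the list to be checked offline.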


185974 Default Oracle paramter file %s does not exist

Description:

Oracle Parameter file has not been specified. Default parameter file indicated in the message does not exist.

Solution:

Please make sure that parameter file exists at the location indicated in message or specify 'Parameter_file' property for the resource.


186306 Conversion of hostnames failed for %s.

Description:

The hostname or IP address given could not be converted to an integer.

Solution:

Add the hostname to the /etc/inet/hosts file. Verify the settings in the /etc/nsswitch.conf file include "files" for host lookup.
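
Whether "files" is among the host lookup sources can be checked by reading the hosts line of /etc/nsswitch.conf. The following is a minimal sketch with a hypothetical function name. It assumes the usual "database: source ..." line layout and discards bracketed criteria such as [NOTFOUND=return].

```python
import re


def hosts_sources(nsswitch_text):
    """Return the lookup sources on the 'hosts:' line, in order."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            # Remove bracketed status/action criteria before splitting.
            sources = re.sub(r"\[[^\]]*\]", " ", line[len("hosts:"):])
            return sources.split()
    return []
```

Reading the file and testing '"files" in hosts_sources(text)' shows whether entries added to /etc/inet/hosts will be consulted.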


186484 PENDING_METHODS: bad resource state <%s> (%d) for resource <%s>

Description:

The rgmd state machine has discovered a resource in an unexpected state on the local node. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


186524 reservation error(%s) - do_scsi2_release() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

The action which failed is a scsi-2 ioctl. These can fail if there are scsi-3 keys on the disk. To remove invalid scsi-3 keys from a device, use 'scdidadm -R' to repair the disk (see scdidadm man page for details). If there were no scsi-3 keys present on the device, then this error is indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.
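
The follow-up actions above depend on the transition named in the message. As a reading aid, they can be summarized as a lookup table. This sketch assumes the text in parentheses after "reservation error" is the transition name, as the solution text implies; the function names are hypothetical and the advice strings merely paraphrase the paragraph above.

```python
import re

RUN_RESERVE = "Run '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes."

FOLLOW_UP = {
    "node_join": RUN_RESERVE,
    "release_shared_scsi2": RUN_RESERVE,
    "make_primary": "Move or start the device group on this node with the scswitch command.",
    "primary_to_secondary": "Retry the shutdown or switchover of the device group.",
}


def transition_of(message):
    """Extract the transition from 'reservation error(<transition>) - ...', or None."""
    match = re.search(r"reservation (?:error|warning)\((\w+)\)", message)
    return match.group(1) if match else None


def follow_up(transition):
    """Return the follow-up action for a transition, once the root cause is fixed."""
    return FOLLOW_UP.get(transition, "No follow-up action is listed for this transition.")
```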


186612 _cladm CL_GET_CLUSTER_NAME failed; perhaps system is not booted as part of cluster

Description:

Could not get cluster name. Perhaps the system is not booted as part of the cluster.

Solution:

Make sure the node is booted as part of a cluster.


187307 invalid debug_level: '%s'

Description:

Invalid debug_level argument passed to udlmctl. udlmctl will not startup.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


187679 NULL hostname passed to validate local host function.


190918 Failed to start orbixd.

Description:

The orbix daemon could not be started.

Solution:

Check whether the orbix daemon can be started manually as the Broadvision user. If it can be started manually but not under HA, save a copy of the /var/adm/messages files on all nodes. Save a copy of the orbixd log files, which are located in /var/run/cluster/bv/, and contact your authorized Sun service provider.


191225 clcomm: Created %d threads, wanted %d for pool %d

Description:

The system creates server threads to support requests from other nodes in the cluster. The system could not create the desired minimum number of server threads. However, the system did succeed in creating at least 1 server thread. The system will have further opportunities to create more server threads. The system cannot create server threads when there is inadequate memory. This message indicates either inadequate memory or an incorrect configuration.

Solution:

There are multiple possible root causes. If the system administrator specified the value of "maxusers", try reducing the value of "maxusers". This reduces memory usage and results in the creation of fewer server threads. If the system administrator specified the value of "cl_comm:min_threads_default_pool" in "/etc/system", try reducing this value. This directly reduces the number of server threads. Alternatively, do not specify this value. The system can automatically select an appropriate number of server threads. Another alternative is to install more memory. If the system administrator did not modify either "maxusers" or "min_threads_default_pool", then the system should have selected an appropriate number of server threads. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


191270 IP address (hostname) string %s in property %s, entry %d does not resolve to an IP address that belongs to one of the resources named in property %s.

Description:

The IP address or hostname named does not belong to one of the network resources designated for use by this resource

Solution:

Either select a different IP address to use that is in one of the network resources used by this resource or create a network resource that contains the named IP address and designate that resource as one of the network resources used by this resource.


191409 scvxvmlg warning - chown(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


191492 CCR: CCR unable to read root file system.

Description:

The CCR failed to read repository due to root file system failure on this node.

Solution:

The root file system needs to be replaced on the offending node.


191506 ERROR: enabled resource <%s> in resource group <%s> depends on disabled resource <%s>

Description:

An enabled resource was found to depend on a disabled resource. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


191772 Failed to configure the networking components for scalable resource %s for method %s.

Description:

The processing that is required for scalable services did not complete successfully.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


191957 The property %s does not have a legal value.

Description:

The property named does not have a legal value.

Solution:

Assign the property a legal value.


192183 freeze_adjust_timeouts: call to rpc.fed failed, tag <%s> err <%d> result <%d>

Description:

The rgmd failed in its attempt to suspend timeouts on an executing method during temporary unavailability of a global device group. This could cause the resource method to time-out. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

No action is required if the resource method execution succeeds. If the problem recurs, rebooting this node might cure it. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


192518 Cannot access start script %s: %s

Description:

The start script is not accessible and executable. This may be due to the script not existing or the permissions not being set properly.

Solution:

Make sure the script exists, is in the proper directory, and has read and execute permissions set appropriately.
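
The existence and permission checks can be done in one step. The sketch below uses a hypothetical function name and tests access for the user running it, so it should be run as the user that the data service uses to launch the script.

```python
import os


def script_problem(path):
    """Return a description of why the script is unusable, or None if it looks fine."""
    if not os.path.isfile(path):
        return "does not exist or is not a regular file"
    # os.access evaluates permissions for the calling user's real uid/gid.
    if not os.access(path, os.R_OK | os.X_OK):
        return "missing read or execute permission for the current user"
    return None
```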


192619 reservation error(%s) - Unable to open device %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

This may be indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.


193137 Service group '%s' deleted

Description:

The service group by that name is no longer known by the scalable services framework.

Solution:

This is an informational message, no user action is needed.


193263 Service is online.

Description:

While attempting to check the health of the data service, probe detected that the resource status is fine and it is online.

Solution:

This is an informational message. No user action is needed.


193933 CMM: Votecount changed from %d to %d for node %s.

Description:

The specified node's votecount has been changed as indicated.

Solution:

This is an informational message, no user action is needed.


194179 Failed to stop the service %s.

Description:

Specified data service failed to stop.

Solution:

Look in /var/adm/messages for the cause of failure. Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


194512 Failed to stop HA-NFS system fault monitor.

Description:

Process monitor facility has failed to stop the HA-NFS system fault monitor.

Solution:

Use pmfadm(1M) with -s option to stop the HA-NFS system fault monitor with tag name "cluster.nfs.daemons". If the error still persists, then reboot the node.


194810 clcomm: thread_create failed for resource_thread

Description:

The system could not create the needed thread, because there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage. Since this happens during system startup, application memory usage is normally not a factor.


195286 CMM: Placing reservation on quorum device %s failed with error %d.

Description:

The specified error was encountered while trying to place a reservation on the specified quorum device, hence this node can not take ownership of this quorum device.

Solution:

There may be other related messages on this and other nodes connected to this quorum device that may indicate the cause of this problem. Refer to the quorum disk repair section of the administration guide for resolving this problem.


195538 Null value is passed for the handle.

Description:

A null handle was passed for the function parameter. No further processing can be done without a proper handle.

Solution:

This is a programming error, and a core file is generated. Specify a non-null handle in the function call.


195565 Configuration file <%s> does not configure %s.

Description:

The configuration file does not have a valid entry for the indicated configuration item.

Solution:

Check that the file has a correct entry for the configuration item.


195867 clexecd: Unexpected eventmask %x in revents of the control fd.

Description:

clexecd program has encountered an error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


196233 INTERNAL ERROR: launch_method: method tag <%s> not found in method invocation list for resource group <%s>

Description:

An internal error has occurred. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


197307 Resource contains invalid hostnames.

Description:

The hostnames that have to be made available by this logical host resource are invalid.

Solution:

It is advised to keep the hostnames in the /etc/inet/hosts file and to enable "files" for host lookup in the nsswitch.conf file. Either of the following situations might have occurred. 1) If the hosts are not in the /etc/inet/hosts file, make sure the nameserver is reachable and has host name entries specified. 2) Invalid hostnames might have been specified while creating the logical host resource. If this is the case, use the scrgadm command to respecify the hostnames for this logical host resource.


197456 CCR: Fatal error: Node will be killed.

Description:

A fatal error occurred on this node during the synchronization of the cluster repository. This node will be killed to allow the synchronization to continue.

Solution:

Look for other messages on this node that indicate the fatal error that occurred. For example, if the root disk on the afflicted node has failed, then it needs to be replaced.


197997 clexecd: dup2 of stdin returned with errno %d while exec'ing (%s). Exiting.

Description:

clexecd program has encountered a failed dup2(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


198216 t_bind cannot bind to requested address

Description:

Call to t_bind() failed. The "t_bind" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


198284 Failed to start fault monitor.

Description:

The fault monitor for this data service was not started. There may be prior messages in syslog indicating specific problems.

Solution:

Correct the problems specified in prior syslog messages. This problem may occur when the cluster is under load and Sun Cluster cannot start the application within the specified timeout period. Consider increasing the Monitor_Start_timeout property. Try switching the resource group to another node using scswitch(1M).


198542 No network resources found for resource.

Description:

No network resources were found for the resource.

Solution:

Declare network resources used by the resource explicitly using the property Network_resources_used. For the resource name and resource group name, check the syslog tag.


198851 fatal: Got error <%d> trying to read CCR when disabling resource <%s>; aborting node

Description:

Rgmd failed to read updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


199467 clcomm::ObjectHandler::_unreferenced called

Description:

This operation should never be executed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


Message IDs 200000–299999


201348 Resource <%s> must be in the same resource group as <%s>.

Description:

There is at least one HAStoragePlus resource, on which this resource depends, that is not in this resource's RG.

Solution:

Move the resources identified by the message into the same RG and retry the operation that caused this error.


201878 clconf: Key length is more than max supported length in clconf_file_io

Description:

In reading configuration data through CCR FILE interface, found the data length is more than max supported length.

Solution:

Check the CCR configuration information.


202528 No permission for group to read %s.

Description:

The group of the file does not have read permission on it.

Solution:

Set the permissions on the file so the group can read it.
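
The group permission bit can be inspected and set with the standard mode constants. This is a minimal sketch with hypothetical function names. It applies equally to the group write bit (stat.S_IWGRP) that the corresponding "write" message refers to.

```python
import os
import stat


def group_can(path, want):
    """True if the given group bit (stat.S_IRGRP or stat.S_IWGRP) is set on path."""
    return bool(os.stat(path).st_mode & want)


def grant_group(path, want):
    """Add the given group bit to path, leaving all other permission bits unchanged."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | want)
```

For example, grant_group(path, stat.S_IRGRP) has the same effect as "chmod g+r" on that file.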


203680 fatal: Unable to bind to nameserver

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


203739 Resource %s uses network resource %s in resource group %s, but the property %s for resource group %s does not include resource group %s. This dependency must be set.

Description:

For all network resources used by a scalable resource, a dependency on the resource group containing the network resource should be created for the resource group of the scalable resource.

Solution:

Use the scrgadm(1M) command to update the RG_dependencies property of the scalable resource's resource group to include the resource groups of all network resources that the scalable resource uses.
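
Which resource groups still need to be added can be worked out as a set difference. The sketch below is illustrative only: the function and the sample names are hypothetical, the input is assumed to be gathered by hand from the cluster configuration, and it assumes that a network resource in the scalable resource's own group needs no dependency entry.

```python
def missing_rg_dependencies(scalable_rg, rg_dependencies, network_resource_groups):
    """Return the resource groups that RG_dependencies should also list.

    scalable_rg: name of the scalable resource's resource group.
    rg_dependencies: current RG_dependencies value as a list of names.
    network_resource_groups: {network resource name: its resource group}.
    """
    # Assumption: a network resource in the same group needs no entry.
    required = set(network_resource_groups.values()) - {scalable_rg}
    return sorted(required - set(rg_dependencies))
```

An empty list means the RG_dependencies property already covers every network resource in use.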


204163 clcomm: error in copyin for state_balancer

Description:

The system failed a copy operation supporting statistics reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


204342 Apache service with startup script <%s> does not configure %s.

Description:

The specified Apache startup script does not configure the specified variable.

Solution:

Edit the startup script and set the specified variable to the correct value.


204584 clexecd: Going down on signal %d.

Description:

clexecd program got a signal indicated in the error message.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


205259 Error reading properties; exiting.

Description:

Solution:


205445 check_and_start(): Out of memory

Description:

The system ran out of memory in function check_and_start().

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.


205754 All specified device services validated successfully.

Description:

All device services specified directly or indirectly via the GlobalDevicePath and FilesystemMountPoint extension properties respectively are found to be correct. Other Sun Cluster components like DCS, DSDL, RGM are found to be in order. Specified file system mount point entries are found to be correct.

Solution:


205822 clcomm:Cannot vfork() after ORB server initialization.

Description:

A user level process attempted to vfork after ORB server initialization. This is not allowed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


205873 Permissions incorrect for %s. s bit not set.

Description:

Permissions of $ORACLE_HOME/bin/oracle are expected to be '-rwsr-s--x' (set-group-ID and set-user-ID set). These permissions are set at the time of Oracle installation. The fault monitor will not function correctly without these permissions.

Solution:

Check file permissions. Check Oracle installation. Relink Oracle, if necessary.
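
The permission check can be expressed in terms of the set-user-ID and set-group-ID mode bits. The following is a minimal sketch with a hypothetical function name. It takes a numeric mode so that it can be fed from os.stat(path).st_mode on the node.

```python
import stat


def permission_report(mode):
    """Return (ls-style permission string, list of missing 's' bits) for a file mode."""
    missing = []
    if not mode & stat.S_ISUID:
        missing.append("set-user-ID")
    if not mode & stat.S_ISGID:
        missing.append("set-group-ID")
    return stat.filemode(mode), missing
```

A file with the expected permissions yields the string '-rwsr-s--x' and an empty list.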


206501 CMM: Monitoring re-enabled.

Description:

Transport path monitoring has been re-enabled in the cluster after being disabled.

Solution:

This is an informational message, no user action is needed.


206947 ON_PENDING_MON_DISABLED: bad resource state <%s> (%d) for resource <%s>

Description:

The rgmd state machine has discovered a resource in an unexpected state on the local node. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


207481 getlocalhostname() failed for resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd was unable to obtain the name of the local host, causing a method invocation to fail. Depending on which method is being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


207510 Some BV servers could not be launched on $HOSTNAME. Check BV logs

Description:

The orbix servers could not be launched. The agent will not return any error because the orbix daemon will try to relaunch the servers at a later time.

Solution:

Check for BV logs. Take appropriate action as specified in the Broadvision Administration guide if the daemons continue to fail and the data service cannot be started.


207615 Property Confdir_list is not set.

Description:

The Confdir_list property is not set. The resource creation is not possible without this property.

Solution:

Set the Confdir_list extension property to the complete path to the WLS home directory. Refer to the configuration guide for more details.


208216 ERROR: resource group <%s> has RG_dependency on non-existent resource group <%s>

Description:

A non-existent resource group is listed in the RG_dependencies of the indicated resource group. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


208596 clcomm: Path %s being initiated

Description:

A communication link is being established with another node.

Solution:

No action required.


208701 %s error status ignored in step %s

Description:

Ignoring the error status from step execution since this does not affect outcome of the step.

Solution:

None.


209009 Unable to resolve <%s>


209090 scha_control RESOURCE_RESTART failed. error %s

Description:

Fault monitor had detected problems in RDBMS server. Attempt to restart RDBMS server on the same node failed. Error returned by API call scha_control is indicated in the message.

Solution:

None.


209274 path_check_start(): Out of memory

Description:

The system ran out of memory in function path_check_start().

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.


209314 No permission for group to write %s.

Description:

The group of the file does not have write permission on it.

Solution:

Set the permissions on the file so the group can write it.


210686 Completed Successfully.


210725 Warning: While trying to lookup host %s, the length of the returned address (%d) was longer than expected (%d). The address will be truncated.

Description:

The value of the resolved address for the named host was longer than expected.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


210975 Stop monitoring saposcol under PMF times out.

Description:

Stopping monitoring the SAP OS collector process under the control of Process Monitor facility times out. This might happen under heavy system load.

Solution:

Consider increasing the monitor stop timeout value.


211198 Completed successfully.

Description:

Data service method completed successfully.

Solution:

No action required.


211873 pmf_search_children: pmf_remove_triggers: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


212337 (%s) scan of seqnum failed on "%s", ret = %d

Description:

Could not get the sequence number from the udlm message received.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


212697 CMM: Open failed with error '(%s)' and errno = %d for quorum device %ld with gdevname '%s'.

Description:

The open operation on the specified quorum device failed, and this node will ignore the quorum device.

Solution:

The quorum device has failed or the path to this device may be broken. Refer to the quorum disk repair section of the administration guide for resolving this problem.


213112 latch_intention(): IDL exception when communicating to node %d

Description:

An inter-node communication failed, probably because a node died.

Solution:

No action is required; the rgmd should recover automatically.


213583 Failed to stop Backup server.

Description:

Sun Cluster HA for Sybase failed to stop backup server using KILL signal.

Solution:

Check whether any Sybase server processes are running on the server, and shut the server down manually.


213599 Failed to stop backup server.

Description:

Sun Cluster HA for Sybase failed to stop backup server using KILL signal.

Solution:

Check whether any Sybase server processes are running on the server, and shut the server down manually.


213973 Resource depends on a SUNW.HAStoragePlus type resource that is not online anywhere.

Description:

The resource depends on a SUNW.HAStoragePlus resource that is not online on any cluster node.

Solution:

Bring all SUNW.HAStoragePlus resources that this HA-NFS resource depends on online before performing the operation that caused this error.


213991 No Network resource in resource group.

Description:

This message indicates that there is no network resource configured for a "network-aware" service. A network aware service can not function without a network address.

Solution:

Configure a network resource and retry the command.


215525 Failed to stop orbixd.

Description:

The orbix daemon could not be stopped by the stop method. This could be an internal error also.

Solution:

Kill orbixd manually and clear all the BV processes running on the node where the resource failed. If all the resources on the node where the stop failed are turned off, delete the /var/run/cluster/bv/bv_orbixd_lock_file file.


215538 Not all hostnames brought online.

Description:

Failed to bring all the hostnames online. Only some of the ip addresses are online.

Solution:

Use the ifconfig command to make sure that the IP addresses are available. Check the error messages that precede this one for a more precise reason for the failure. Use the scswitch command to move the resource group to a different node. If the problem persists, reboot.


216087 rebalance: resource group <%s> is being switched updated or failed back, cannot assign new primaries

Description:

The indicated resource group has lost a master due to a node death. However, the RGM is unable to switch the resource group to a new master because the resource group is currently in the process of being modified by an operator action, or is currently in the process of "failing back" onto a node that recently joined the cluster.

Solution:

Use scstat(1M) -g to determine the current mastery of the resource group. If necessary, use scswitch(1M) -z to switch the resource group online on desired nodes.


216217 %s failed to complete.

Description:

Solution:


216244 CCR: Table %s has invalid checksum field. Reported: %s, actual: %s.

Description:

The indicated table has an invalid checksum that does not match the table contents. This causes the consistency check on the indicated table to fail.

Solution:

Boot the offending node in -x mode to restore the indicated table from backup or other nodes in the cluster. The CCR tables are located at /etc/cluster/ccr/.


216379 Stopping fault monitor using pmfadm tag %s

Description:

Informational message. Fault monitor will be stopped using Process Monitoring Facility (PMF), with the tag indicated in message.

Solution:

None


216774 WARNING: update_state:udlm_send_reply failed

Description:

A udlm state update failed to send its reply. This results in a udlm abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


217093 Call failed: %s

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) to execute the action shown. The rpc error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


217713 Unable to fork().


218227 Error accessing policy string

Description:

This message appears when the customer is initializing or changing a scalable services load balancer, by starting or updating a service. The Load_Balancing_String is missing.

Solution:

Add a Load_Balancing_String parameter when creating the resource group.


218575 low memory: unable to capture output

Description:

The rpc.fed server was not able to allocate memory necessary to capture the output from methods it runs.

Solution:

Investigate whether the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


218780 Stopping the monitor server.

Description:

The Monitor server is about to be brought down by Sun Cluster HA for Sybase.

Solution:

This is an informational message; no user action is needed.


219058 Failed to stop the backup server using %s.

Description:

Sun Cluster HA for Sybase failed to stop the backup server using the file specified in the STOP_FILE property. Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and by the root user.


219930 Cannot determine if the server is secure: assuming secure.

Description:

While attempting to determine if the Netscape server is running under secure or non-secure mode an error occurred. This error results in the Data Service assuming a secure Netscape server, and will probe the server as such.

Solution:

This message is issued after an internal error has occurred. Refer to syslog for that message.


220753 %s: Arg error, invalid tag <%s> restarting service.

Description:

The PMF tag supplied as argument to the PMF action script is not a tag generated by the DSDL; the PMF action script has restarted the application.

Solution:

This is an internal error. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


220849 CCR: Create table %s failed.

Description:

The CCR failed to create the indicated table.

Solution:

The failure can happen for many reasons. For some of them no user action is required, because the CCR client handles the failure. The cases that require user action depend on other messages from the CCR on the node, and include the following: If the operation failed because the cluster lost quorum, reboot the cluster. If the root file system is full on the node, free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, replace it. If the cluster repository is corrupted, as indicated by other CCR messages, boot the offending node(s) in -x mode to restore the cluster repository from backup. The cluster repository is located at /etc/cluster/ccr/.


221314 INTERNAL ERROR: usage: $0 <gateway_or_server_root>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


221746 Failed to retrieve property %s: %s.

Description:

The PMF action script supplied by the DSDL could not retrieve the given property of the resource.

Solution:

Check the syslog messages around the time of the error for messages indicating the cause of the failure. If this error persists, contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.
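
One way to pull the relevant syslog lines is sketched below in Python. It assumes that each line in /var/adm/messages carries a Solaris-style tag of the form "[ID 221746 daemon.error]"; that line format, the helper name context_around_id, and the window parameter are assumptions made for illustration and are not defined by this guide.

```python
import re

# Assumed tag format: "[ID <number> <facility>.<level>]"
MSG_ID = re.compile(r"\[ID (\d+) [\w.]+\]")


def context_around_id(lines, message_id, window=2):
    """Return lines tagged with message_id plus `window` lines on each side."""
    wanted = str(message_id)
    keep = set()
    for i, line in enumerate(lines):
        match = MSG_ID.search(line)
        if match and match.group(1) == wanted:
            first = max(0, i - window)
            last = min(len(lines), i + window + 1)
            keep.update(range(first, last))
    return [lines[i] for i in sorted(keep)]
```

Reading the file with open("/var/adm/messages").read().splitlines() and passing the result with the message ID returns the tagged lines together with their neighbors, which is usually where the cause of the failure is logged.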


222512 fatal: could not create death_ff

Description:

The daemon indicated in the message tag (rgmd or ucmmd) was unable to create a failfast device. The failfast device kills the node if the daemon process dies either due to hitting a fatal bug or due to being killed inadvertently by an operator. This is a requirement to avoid the possibility of data corruption. The daemon will produce a core file and will cause the node to halt or reboot.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the core file generated by the daemon. Contact your authorized Sun service provider for assistance in diagnosing the problem.


223145 gethostbyname failed for (%s)

Description:

Failed to get information about a host. The "gethostbyname" man page describes possible reasons.

Solution:

Make sure entries in /etc/hosts, /etc/nsswitch.conf and /etc/netconfig are correct to get information about this host.
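
After correcting the files, the lookup can be repeated outside the cluster software. The following Python sketch uses the standard socket.gethostbyname call; the helper name resolve_host is hypothetical, and the sketch assumes a Python interpreter is available on the node.

```python
import socket


def resolve_host(name):
    """Return the IPv4 address for name, or None if the lookup fails."""
    try:
        return socket.gethostbyname(name)
    except OSError as err:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        print("lookup of %s failed: %s" % (name, err))
        return None
```

A return value of None for the host named in the message suggests that the name service configuration still needs attention.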


223458 INTERNAL ERROR CMM: quorum_algorithm_init called already.

Description:

This is an internal error during node initialization, and the system can not continue.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


224682 Failed to initialize the probe history.

Description:

A process has failed to allocate memory for the probe history structure, most likely because the system has run out of swap space.

Solution:

To solve this problem, increase swap space by configuring additional swap devices. See swap(1M) for more information.


224718 Failed to create scalable service in group %s for IP %s Port %d%c%s: %s.

Description:

A call to the underlying scalable networking code failed. This call may fail because the IP, Port, and Protocol combination listed in the message conflicts with the configuration of an existing scalable resource. A conflict can occur if the same combination exists in a scalable resource that is already configured on the cluster. A combination may also conflict if there is a resource that uses Load_balancing_policy LB_STICKY_WILD with the same IP address as a different resource that also uses LB_STICKY_WILD.

Solution:

Try using a different IP, Port, and Protocol combination. Otherwise, save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


224783 clcomm: Path %s has been deleted

Description:

A communication link is being removed with another node. The interconnect may have failed or the remote node may be down.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


225296 "pmfadm -s": Can not stop <%s>: Monitoring is not resumed on pid %d

Description:

The command 'pmfadm -s' can not be executed on the given tag because the monitoring is suspended on the indicated pid.

Solution:

Resume the monitoring on the indicated pid with the 'pmfctl -R' command.


225882 Internal: Unknown command type (%d)

Description:

An internal error has occurred in the rgmd while trying to connect to the rpc.fed server.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


226280 PNM daemon exiting

Description:

The PNM daemon (pnmd) is shutting down.

Solution:

This message is informational; no user action is needed.


226747 Unregister callback with NAFO %s failed. Error %d.

Description:

Solution:


226914 scswitch: internal error: bad nodename %s in nodelist of resource group %s

Description:

The indicated resource group's Nodelist property, as stored in the CCR, contains an invalid nodename. This might indicate corruption of CCR data or rgmd in-memory state. The scswitch command will fail.

Solution:

Use scstat(1M) -g and scrgadm(1M) -pvv to examine resource group properties. If the values appear corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


226907 Failed to run the DB probe script %s

Description:

This is an internal error.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


227214 Error: duplicate method <%s> launched on resource <%s> in resource group <%s>

Description:

Due to an internal error, the rgmd state machine has attempted to launch two different methods on the same resource on the same node, simultaneously. The rgmd will reject the second attempt and treat it as a method failure.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


227445 Unable to create directory %s: %s.


227820 Attempting to stop the data service running under process monitor facility.

Description:

The function is going to request the PMF to stop the data service. If the request fails, refer to the syslog messages that appear after this message.

Solution:

This is an informational message, no user action is required.


228021 Failed to retrieve the extension property <%s> for NetBackup, error : %s.

Description:

The extension <%s> is missing in the RTR File.

Solution:

Serious error. The RTR file may be corrupted. Reload the package for HA-NetBackup SUNWscnb. If problem persists contact the Sun Cluster HA developer.


228212 reservation fatal error(%s) - unable to get local node id

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


228399 Unable to stop processes running under PMF tag %s.

Description:

Sun Cluster HA for Sybase failed to stop processes using Process Monitoring Facility. Please examine if the PMF tag indicated in the message exists on the node. (command: pmfadm -l <tag>). Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Please examine Sybase logs and syslog messages. If this message is seen under heavy system load, it will be necessary to increase Stop_timeout property of the resource.


228461 CMM: Issuing a SCSI2 Release failed on quorum device %s with error %d.

Description:

This node encountered the specified error while issuing a SCSI2 Release operation on the specified quorum device. The quorum code will either retry this operation or will ignore this quorum device.

Solution:

There may be other related messages that may provide more information regarding the cause of this problem. SCSI2 operations fail with an error code of EACCES if SCSI3 keys are present on the device. Scrub the SCSI3 keys off of the quorum device.


231556 %s can't plumb %s: out of instance numbers on %s

Description:

This means that we have reached the maximum number of Logical IPs allowed on an adapter.

Solution:

Raise the limit by increasing the ndd variable ip_addrs_per_if. The default is 256 and the maximum is 8192.
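
A minimal sketch of preparing the change follows. It assumes the Solaris syntax 'ndd -set /dev/ip ip_addrs_per_if <value>' and a lower bound of 1; both are assumptions, only the maximum of 8192 comes from this entry. The helper name ndd_set_command is hypothetical, and it only builds the command without running it.

```python
MAX_ADDRS_PER_IF = 8192  # maximum stated in this entry


def ndd_set_command(value):
    """Build the ndd argument list for a new ip_addrs_per_if value."""
    # The lower bound of 1 is an assumption, not taken from this guide.
    if not 1 <= value <= MAX_ADDRS_PER_IF:
        raise ValueError("ip_addrs_per_if must be between 1 and %d"
                         % MAX_ADDRS_PER_IF)
    return ["ndd", "-set", "/dev/ip", "ip_addrs_per_if", str(value)]
```

The returned list can be reviewed first and then passed to subprocess.run() as root on the affected node.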


231770 ns: Could not initialize ORB: %d

Description:

could not initialize ORB.

Solution:

Please make sure the nodes are booted in cluster mode.


231991 WARNING: lkcm_dreg: udlm_send_reply failed

Description:

Could not deregister udlm with ucmm.

Solution:

None.


232201 Invalid port number returned.

Description:

Invalid port number was retrieved for the Port_list property of the resource.

Solution:

The required user action depends on the situation: 1) If a resource has been created or updated, check whether it has a valid port number. If the port number is not valid, provide a valid port number using the scrgadm(1M) command. 2) Check the syslog messages that occurred just before this message. If they indicate an "Out of memory" problem, correct it. 3) For all other cases, treat this as an internal error and contact your authorized Sun service provider.


232501 Validation failed. ORACLE_HOME/bin/svrmgrl not found ORACLE_HOME=%s

Description:

Oracle binaries (svrmgrl) not found in ORACLE_HOME/bin directory. ORACLE_HOME specified for the resource is indicated in the message. HA-Oracle will not be able to manage resource if ORACLE_HOME is incorrect.

Solution:

Specify correct ORACLE_HOME when creating resource. If resource is already created, please update resource property 'ORACLE_HOME'.


232565 Scalable services enabled.

Description:

This means that the scalable services framework is set up in the cluster. Specifically, it is printed out for the node that has joined the cluster and for which services have been downloaded. Once the services have been downloaded, those services are ready to participate as scalable services.

Solution:

This is an informational message, no user action is needed.


232920 -d must be followed by a hex bitmask

Description:

Incorrect arguments were used while setting up Sun-specific startup parameters for the Oracle UNIX dlm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


233017 Successfully stopped %s.

Description:

The resource was successfully stopped by Sun Cluster.

Solution:

No user action is required.


233053 SharedAddress offline.

Description:

The status of the sharedaddress resource is offline.

Solution:

This is an informational message. No user action is required.


233327 Switchover (%s) error: failed to mount FS (%d)

Description:

The file system specified in the message could not be hosted on the node the message came from.

Solution:

Check /var/adm/messages to make sure there were no device errors. If not, contact your authorized Sun service provider to determine whether a workaround or patch is available.


233956 Error in reading message in child process: %m

Description:

Error occurred when reading message in fault monitor child process. Child process will be stopped and restarted.

Solution:

If the error persists, disable the fault monitor and report the problem.


233961 scvxvmlg error - symlink(%s, %s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


234438 INTERNAL ERROR: Invalid resource property type <%d> on resource <%s>; aborting node

Description:

An attempted creation or update of a resource has failed because of invalid resource type data. This may indicate CCR data corruption or an internal logic error in the rgmd. The rgmd will produce a core file and will force the node to halt or reboot.

Solution:

Use scrgadm(1M) -pvv to examine resource properties. If the resource or resource type properties appear to be corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Re-try the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


234463 INTERNAL ERROR: process_resource: resource group <%s> is pending_mon_disable but contains resource <%s> in STOP_FAILED state

Description:

During a resource monitor disable (scswitch -M -n), the rgmd has discovered a resource in STOP_FAILED state. This may indicate an internal logic error in the rgmd, since updates are not permitted on the resource group until the STOP_FAILED error condition is cleared.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


235455 inet_ntop: %s

Description:

Solution:


235707 pmf_monitor_suspend: pmf_remove_triggers: %s

Description:

The rpc.pmfd server was not able to suspend the monitoring of a process and the monitoring of the process has been aborted. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


236733 lookup of oracle dba gid failed.

Description:

Could not find group id for dba. udlm will not startup.

Solution:

Make sure that the /etc/nsswitch.conf and /etc/group files are valid and contain the information needed to look up the group id of dba.
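
The same lookup can be repeated by hand to confirm the fix. The following Python sketch uses the standard grp module, which goes through the system's configured group database; the helper name lookup_gid is hypothetical, and the sketch assumes a Python interpreter is available on the node.

```python
import grp


def lookup_gid(group_name="dba"):
    """Return the numeric gid for group_name, or None if it is not defined."""
    try:
        return grp.getgrnam(group_name).gr_gid
    except KeyError:
        return None
```

If lookup_gid() returns None on a node, the dba group is still not visible there and udlm is unlikely to start.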


237149 clcomm: Path %s being constructed

Description:

A communication link is being established with another node.

Solution:

No action required.


237547 Adaptive server stopped (nowait).

Description:

The Sybase adaptive server has been stopped by Sun Cluster HA for Sybase using the nowait option.

Solution:

This is an informational message, no user action is needed.


237724 Failed to retrieve hostname: %s.

Description:

The callback method failed to determine the hostname. The callback methods will now be executed in the /var/core directory.

Solution:

No user action is needed. For detailed error message, look at the syslog message.


237744 SAP was brought up outside of HA-SAP, HA-SAP will not shut it down.

Description:

SAP was started up outside of the control of Sun Cluster. It will not be shut down automatically.

Solution:

Shut down SAP before trying to start it under the control of Sun Cluster.


237781 WARNING: lkcm_sync: %d returned from udlm_recv_message


239109 Validation failed. Resource property RESOURCE_DEPENDENCIES should contain at least the rac_framework resource

Description:

The resource being created or modified should be dependent upon the rac_framework resource in the RAC framework resource group.

Solution:

If not already created, create the RAC framework resource group and its associated resources. Then specify the rac_framework resource for this resource's RESOURCE_DEPENDENCIES property.


239415 Failed to retrieve the cluster handle: %s.

Description:

Access to the object named failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


239735 Couldn't parse policy string %s

Description:

This message appears when the customer is initializing or changing a scalable services load balancer, by starting or updating a service. The Load_Balancing_String is invalid.

Solution:

Check the Load_Balancing_String value specified when creating the resource group and make sure that a valid value is used.


240376 No protocol was given as part of property %s for element %s. The property must be specified as %s=PortNumber%cProtocol,PortNumber%cProtocol,...

Description:

The property named does not have a legal value.

Solution:

Assign the property a legal value.
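
The expected format can be illustrated with a small parser, shown here as a sketch in Python. The separator character appears only as %c in the message, so the "/" default below (as in "80/tcp") is an assumption; the helper name parse_port_list is hypothetical.

```python
def parse_port_list(value, sep="/"):
    """Parse 'PortNumber<sep>Protocol,...' into (port, protocol) pairs.

    Raises ValueError if an element lacks a protocol or has a bad port.
    """
    pairs = []
    for element in value.split(","):
        element = element.strip()
        port_text, found, protocol = element.partition(sep)
        if not found or not protocol:
            raise ValueError("no protocol given for element %r" % element)
        if not port_text.isdigit():
            raise ValueError("invalid port number in element %r" % element)
        pairs.append((int(port_text), protocol))
    return pairs
```

With the assumed separator, parse_port_list("80/tcp,443/tcp") yields two pairs, while a value such as "80" is rejected because no protocol is given, which is the condition this message reports.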


240388 Prog <%s> step <%s>: timed out.

Description:

A step has exceeded its configured timeout and was killed by ucmmd. This in turn will cause a reconfiguration of OPS.

Solution:

Other syslog messages occurring just before this one might indicate the reason for the failure. After correcting the problem that caused the step to fail, the operator may retry reconfiguration of OPS.


240529 Listener status probe failed with exit code %s.

Description:

An attempt to query the status of the Oracle listener using command 'lsnrctl status <listener_name>' failed with the error code indicated. HA-Oracle will attempt to kill the listener and then restart it.

Solution:

None, HA-Oracle will attempt to restart the listener. However, the cause of the failure should be investigated further. Examine the log file and syslog messages for additional information.


241147 Invalid value %s for property %s.

Description:

An invalid value was supplied for the property.

Solution:

Supply "conf" or "boot" as the value for DNS_mode property.


241441 clexecd: ioctl(I_RECVFD) returned %d. Returning %d to clexecd.

Description:

clexecd program has encountered a failed ioctl(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


241630 File %s should be readable only by the owner %s.

Description:

The specified file is expected to be readable only by the specified user, who is the owner of the file.

Solution:

Make sure that the specified file has the correct permissions by running 'chmod 600 <file>' for non-executable files, or 'chmod 700 <file>' for executable files.
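
The check and the chmod modes named above can be sketched in Python, assuming a Python interpreter is available on the node. The helper names readable_only_by_owner and restrict_to_owner are hypothetical and are not part of Sun Cluster.

```python
import os
import stat


def readable_only_by_owner(path):
    """Return True if group and other have no permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def restrict_to_owner(path, executable=False):
    """Apply mode 700 for executable files, 600 otherwise."""
    os.chmod(path, 0o700 if executable else 0o600)
```

Run readable_only_by_owner() on the file named in the message; if it returns False, restrict_to_owner() applies the same modes as the chmod commands.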


241948 Failed to retrieve resource <%s> extension property <%s>: %s.

Description:

An internal error occurred in the rgmd while checking a resource property.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


242214 clexecd: fork1 returned %d. Returning %d to clexecd.

Description:

clexecd program has encountered a failed fork1(2) system call. The error message indicates the error number for the failure.

Solution:

If the error number is 12 (ENOMEM), install more memory, increase swap space, or reduce peak memory consumption. If error number is something else, contact your authorized Sun service provider to determine whether a workaround or patch is available.
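
The decision above can be sketched in Python as follows. The helper name classify_fork_error is hypothetical; the sketch compares against the symbolic constant errno.ENOMEM rather than the literal 12, so it follows whatever numbering the local platform uses.

```python
import errno
import os


def classify_fork_error(err):
    """Map the error number from the message to the suggested action."""
    if err == errno.ENOMEM:
        return "add memory or swap, or reduce peak memory consumption"
    name = errno.errorcode.get(err, "unknown")
    return "contact service provider (%s: %s)" % (name, os.strerror(err))
```

Passing the first number from the message text returns either the memory-related advice or the symbolic name of the error, which is useful to include when reporting the problem.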


242385 Failed to stop liveCache with command %s. Return code from SAP command is %d.

Description:

Failed to shutdown liveCache immediately with listed command. The return code from the SAP command is listed.

Solution:

Check the return code for dbmcli command for reasons of failure. Also, check the SAP log files for more details.


243444 CMM: Issuing a SCSI2 Tkown failed for quorum device with error %d.

Description:

This node encountered the specified error while issuing a SCSI2 Tkown operation on a quorum device. This will cause the node to conclude that it has been unsuccessful in preempting keys from the quorum device, and therefore the partition to which it belongs has been preempted. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

There will be other related messages that will identify the quorum device for which this error has occurred. If the error encountered is EACCES, then the SCSI2 command could have failed due to the presence of SCSI3 keys on the quorum device. Scrub the SCSI3 keys off of it, and reboot the preempted nodes.


243639 Scalable service instance [%s,%s,%d] deregistered on node %s.

Description:

The specified scalable service had been deregistered on the specified node. Now, the gif node cannot redirect packets for the specified service to this node.

Solution:

This is an informational message, no user action is needed.


243965 udlm_ack_msg: udp is null!

Description:

Can not acknowledge a message received from udlmctl because the address to acknowledge to is null.

Solution:

None.


243996 Failed to retrieve resource <%s> extension property <%s>

Description:

Can not get extension property.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


244116 clcomm: socreate on routing socket failed with error = %d

Description:

The system prepares IP communications across the private interconnect. A socket create operation on the routing socket failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


244218 Could not stop the BV processes on $HOSTNAME

Description:

The BV processes on the specified host could not be stopped.

Solution:

Check whether the BV processes can be stopped manually. Save /var/adm/messages and other relevant BV logs and contact Sun support.


245128 Extension property %s must be set.

Description:

The indicated property must be set by the user.

Solution:

Use scrgadm to set the property.


245186 reservation warning(%s) - MHIOCGRP_PREEMPTANDABORT error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


245194 Error with hosts.allow or hosts.deny

Description:

Solution:


246769 kill -TERM: %s

Description:

The rpc.fed server is not able to kill a tag that timed out, and the error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified.


246999 Failed to retrieve host information for %s: %s.


247405 Failed to take the resource out of PMF control. Will shutdown WLS using sigkill

Description:

The resource could not be taken out of pmf control. The WLS will however be shutdown by killing the process using sigkill.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


247682 recv_message: cm_reconfigure: %s

Description:

udlm received a message to reconfigure.

Solution:

None. OPS is going to reconfigure.


247752 Failed to start the service %s.

Description:

Specified data service failed to start.

Solution:

Look in /var/adm/messages for the cause of failure. Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


247868 in libsecurity rpcb_getaddr program number %d failed

Description:

The rpc.pmfd, rpc.fed or rgmd server was not able to call rpcb_getaddr, which is used to cache rpcbind information. The affected component should continue to function by calling rpcbind directly.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


248031 scvxvmlg warning - %s does not exist, creating it

Description:

The program responsible for maintaining the VxVM device namespace has discovered inconsistencies between the VxVM device namespace on this node and the VxVM configuration information stored in the cluster device configuration system. If configuration changes were made recently, then this message should reflect one of the configuration changes. If no changes were made recently or if this message does not correctly reflect a change that has been made, the VxVM device namespace on this node may be in an inconsistent state. VxVM volumes may be inaccessible from this node.

Solution:

If this message correctly reflects a configuration change to VxVM diskgroups then no action is required. If the change this message reflects is not correct, then the information stored in the device configuration system for each VxVM diskgroup should be examined for correctness. If the information in the device configuration system is accurate, then executing '/usr/cluster/lib/dcs/scvxvmlg' on this node should restore the device namespace. If the information stored in the device configuration system is not accurate, it must be updated by executing '/usr/cluster/bin/scconf -c -D name=diskgroup_name' for each VxVM diskgroup with inconsistent information.


249804 INTERNAL ERROR CMM: Failure creating sender thread.

Description:

An instance of the userland CMM encountered an internal initialization error. This is caused by inadequate memory on the system.

Solution:

Add more memory to the system. If that does not resolve the problem, contact your authorized Sun service provider to determine whether a workaround or patch is available.


249934 Method <%s> failed to execute on resource <%s> in resource group <%s>, error: <%d>

Description:

A resource method failed to execute, due to a system error number identified in the message. The indicated error number appears not to match any of the known errno values described in intro(2). This is considered a method failure. Depending on which method is being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state, or it might cause an attempted edit of a resource group or its resources to fail.

Solution:

Other syslog messages occurring at about the same time might provide evidence of the source of the problem. If not, save a copy of the /var/adm/messages files on all nodes, and (if the rgmd did crash) a copy of the rgmd core file, and contact your authorized Sun service provider for assistance.


250047 Failed to start Broadvision servers on %s.

Description:

The Broadvision servers could not be started on the specified host. This could happen if orbixd did not get started properly or if there are any configuration errors.

Solution:

See if there are any internal errors. Look at the Broadvision configuration and check if everything is OK. Try to start BV on the specified host manually and check if it can be started properly. If orbixd could be started manually but could not be started under HA, contact Sun support with /var/adm/messages and other BV logs.


250133 Failed to open the device %s: %s.

Description:

This is an internal error. System failed to perform the specified operation.

Solution:

For specific error information check the syslog message. Provide the following information to your authorized Sun service provider to diagnose the problem. 1) Saved copy of /var/adm/messages file 2) Output of "ls -l /dev/sad" command 3) Output of "modinfo | grep sad" command.


250151 write: %s

Description:

Solution:


250387 Stop fault monitor using pmfadm failed. tag %s error=%s.

Description:

The Process Monitoring Facility could not stop the Sun Cluster HA for Sybase fault monitor. The fault monitor tag is provided in the message. The error returned by the PMF is indicated in the message.

Solution:

Stop the fault monitor processes. Contact your authorized Sun service provider to report this problem.


251472 Validation failed. SYBASE directory %s does not exist.

Description:

The indicated directory does not exist. The SYBASE environment variable may be incorrectly set or the installation may be incorrect.

Solution:

Check the SYBASE environment variable value and verify the Sybase installation.


251552 Failed to validate configuration.

Description:

The data service is not properly configured.

Solution:

Look at the prior syslog messages for specific problems and take corrective action.


251702 Error initializing an internal component of the version manager (error %d).

Description:

This message can occur when the system is booting if incompatible versions of cluster software are installed.

Solution:

Verify that any recent software installations completed without errors and that the installed packages or patches are compatible with the rest of the installed software. Also, contact your authorized Sun service provider to determine whether a workaround or patch is available.


252457 The %s command does not have execute permissions: <%s>

Description:

This command input to the agent builder does not have the expected default execute permissions.

Solution:

Reset the permissions to allow execute permissions using the chmod command.
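The same check and fix can be scripted. The following Python sketch is illustrative only: ensure_executable is a hypothetical helper, and adding the execute bit for owner, group, and other (the equivalent of chmod a+x) is an assumption, since this guide does not state which execute bits the agent builder expects.

```python
import os
import stat


def ensure_executable(path):
    """Add execute bits to path if any are missing.

    Returns True if the mode was changed, False if it was already
    executable by owner, group, and other. Which bits are really
    required is an assumption made for this sketch.
    """
    mode = os.stat(path).st_mode
    wanted = mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if wanted == mode:
        return False
    # Equivalent to running: chmod a+x <path>
    os.chmod(path, wanted)
    return True
```

Restrict the bits that are added if site policy does not allow world-executable commands.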


254053 Initialization error. Fault Monitor password is NULL

Description:

Internal error. Environment variable SYBASE_MONITOR_PASSWORD not set before invoking fault monitor.

Solution:

Report this problem to your authorized Sun service provider.


254131 resource group %s removed.

Description:

This is a notification from the rgmd that the operator has deleted a resource group. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


254388 Failed to retrieve Message server pid.

Description:

The process ID for the message server could not be retrieved, which indicates that the message server process is not running.

Solution:

No action needed. The fault monitor will detect this and take appropriate action.


254692 scswitch: internal error: bad state <%s> (<%d>) for resource group <%s>

Description:

While attempting to execute an operator-requested switch of the primaries of a resource group, the rgmd has discovered the indicated resource group to be in an invalid state. The switch action will fail.

Solution:

This may indicate an internal error or bug in the rgmd. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


255071 Low memory: unable to process client registration

Description:

Solution:


255115 Retrying to retrieve the resource type information.

Description:

An update to the cluster configuration occurred while resource type properties were being retrieved.

Solution:

Ignore the message.


255929 in libsecurity authsys_create_default failed

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) because it failed the authentication process. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


256023 pmf_monitor_suspend: Error opening procfs status file <%s> for tag <%s>: %s

Description:

The rpc.pmfd server was not able to open a procfs status file, and the system error is shown. procfs status files are required in order to monitor user processes. This error occurred for a process whose monitoring had been suspended. The monitoring of this process has been aborted and can not be resumed.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


257965 File %s should be readable by %s.

Description:

A program required the specified file to be readable by the specified user.

Solution:

Set correct permissions for the specified file to allow the specified user to read it.


258357 Method <%s> failed to execute on resource <%s> in resource group <%s>, error: <%s>

Description:

A resource method failed to execute, due to a system error described in the message. For an explanation of the error message, consult intro(2). This is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state, or it might cause an attempted edit of a resource group or its resources to fail.

Solution:

If the error message is not self-explanatory, other syslog messages occurring at about the same time might provide evidence of the source of the problem. If not, save a copy of the /var/adm/messages files on all nodes, and (if the rgmd did crash) a copy of the rgmd core file, and contact your authorized Sun service provider for assistance.


258909 clexecd: sigfillset returned %d. Exiting.

Description:

clexecd program has encountered a failed sigfillset(3C) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


259455 in fe_set_env_vars malloc failed

Description:

The rgmd server was not able to allocate memory for the environment name, while trying to connect to the rpc.fed server, possibly due to low memory. An error message is output to syslog.

Solution:

Investigate if the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


259810 reservation error(%s) - do_scsi3_reserve() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

For the user action required by this message, see the user action for message 192619.


261123 resource group %s state change to managed.

Description:

This is a notification from the rgmd that a resource group's state has changed. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


262295 Failback bailing out because resource group <%s>is being updated or switched

Description:

The rgmd was unable to failback the specified resource group to a more preferred node because the resource group was already in the process of being updated or switched.

Solution:

This is an informational message, no user action is needed.


262898 Name service not available.

Description:

The monitor_check method detected that name service is not responsive.

Solution:

Check if name service is configured correctly. Try some commands to query name servers, such as ping and nslookup, and correct the problem. If the error still persists, then reboot the node.
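Name resolution can also be exercised from a script, alongside ping and nslookup. The following Python sketch is illustrative only: can_resolve is a hypothetical helper, and the host name you pass in is your own choice. It uses the standard socket.getaddrinfo call, which goes through the system resolver.

```python
import socket


def can_resolve(hostname):
    """Return True if the system resolver can resolve hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Resolution failed: the name is unknown or the name service
        # did not answer.
        return False
```

A False result for a host name that is known to be valid suggests that the name service is not responding or is misconfigured.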


263110 Error deleting PidLog <%s> (%s) for iPlanet service with config file <%s>

Description:

The data service was not able to delete the specified PidLog file.

Solution:

Delete the PidLog file manually and start the resource group.


263258 CCR: More than one copy of table %s has the same version but different checksums. Using the table from node %s.

Description:

The CCR detects that two valid copies of the indicated table have the same version but different contents. The copy on the indicated node will be used by the CCR.

Solution:

This is an informational message, no user action is needed.


263606 unpack_rg_seq: rname_to_r error <%s>

Description:

Due to an internal error, the rgmd was unable to find the specified resource data in memory.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


263992 Successfully started NFS service.


264326 Error (%s) when reading extension property <%s>.

Description:

Error occurred in API call scha_resource_get.

Solution:

Check syslog messages for errors logged from other system modules. If error persists, please report the problem.


265925 CMM: Cluster lost operational quorum; aborting.

Description:

Not enough nodes are operational to maintain a majority quorum, causing the cluster to fail to avoid a potential split brain.

Solution:

The nodes should be rebooted.


266059 security_svc_reg failed.

Description:

The rpc.pmfd server was not able to initialize authentication and rpc initialization. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


266274 %s: No memory, restarting service.

Description:

The PMF action script supplied by the DSDL could not complete its function because it could not allocate the required amount of memory; the PMF action script has restarted the application.

Solution:

If this error persists, contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


266834 CMM: Our partition has been preempted.

Description:

The cluster partition to which this node belongs has been preempted by another partition during a reconfiguration. The preempted partition will abort. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

There may be other related messages that may indicate why quorum was lost. Determine why quorum was lost on this node partition, resolve the problem and reboot the nodes in this partition.
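The majority rule behind preemption can be illustrated with a small calculation. The following Python sketch is illustrative only: has_majority_quorum is a hypothetical helper and the vote counts are made-up values, not output from any cluster command. It assumes that a partition survives only if it holds more than half of all configured votes.

```python
def has_majority_quorum(partition_votes, total_configured_votes):
    """Return True if the partition holds a strict majority of votes."""
    return 2 * partition_votes > total_configured_votes


if __name__ == "__main__":
    # Hypothetical two-node cluster with one quorum device:
    # one vote per node plus one vote for the device, three in total.
    total = 3
    print(has_majority_quorum(2, total))  # node plus quorum device
    print(has_majority_quorum(1, total))  # node alone
```

In this hypothetical layout the partition that holds the node and the quorum device has two of three votes and keeps quorum, while the partition with a single vote does not and would abort.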


267558 Error when reading property %s.

Description:

Unable to read property value using API. Property name is indicated in message. Syslog messages may give more information on errors in other modules.

Solution:

Check syslog messages. Please report this problem.


267589 launch_fed_prog: call to rpc.fed failed for program <%s>, step <%s>

Description:

Launching of fed program failed due to a failure of ucmmd to communicate with the rpc.fed daemon. If the rpc.fed process died, this might lead to a subsequent reboot of the node.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


267673 Validation failed. ORACLE binaries not found ORACLE_HOME=%s

Description:

Oracle binaries not found under ORACLE_HOME. ORACLE_HOME specified for the resource is indicated in the message. HA-Oracle will not be able to manage Oracle if ORACLE_HOME is incorrect.

Solution:

Specify correct ORACLE_HOME when creating resource. If resource is already created, please update resource property 'ORACLE_HOME'.


267724 stat of file system %s failed.

Description:

HA-NFS fault monitor reports a probe failure on a specified file system.

Solution:

Make sure the specified path exists.


268593 Failed to take the resource out of PMF control. Sending SIGKILL now.

Description:

An error was encountered while taking the resource out of PMF's control while stopping the resource. The resource will be stopped by sending it SIGKILL.

Solution:

This message is informational; no user action is needed.


269027 sigfillset: %s

Description:

Solution:


269240 clconf: Write_ccr routine shouldn't be called from kernel

Description:

Routine write_ccr that writes a clconf tree out to CCR should not be called from kernel.

Solution:

No action required. This is an informational message.


269902 reservation fatal error(%s) - Unable to find gdev property

Description:

A required rawdisk device group property is missing.

Solution:

Executing '/usr/cluster/bin/scgdevs -L' on this node should generate the required property. If this successfully creates the required property, it should be possible to retry the failed operation. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


270043 reservation warning(%s) - MHIOCENFAILFAST error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


272139 Message Server Process is not running. pid was %d.

Description:

The message server process is not present in the process list, which indicates that the message server process is not running on this node.

Solution:

No action needed. Fault monitor will detect that message server process is not running, and take appropriate action.


272238 reservation warning(%s) - MHIOCGRP_RESERVE error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


272434 Validation failed. SYBASE text server startup file RUN_%s not found SYBASE=%s.

Description:

Text server was specified in the extension property Text_Server_Name. However, text server startup file was not found. Text server startup file is expected to be: $SYBASE/$SYBASE_ASE/install/RUN_<Text_Server_Name>

Solution:

Check the text server name specified in the Text_Server_Name property. Verify that the SYBASE and SYBASE_ASE environment variables are set properly in the Environment_file. Verify that the RUN_<Text_Server_Name> file exists.
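The expected location of the startup file can be computed and checked with a short script. The following Python sketch is illustrative only: text_server_run_file and run_file_exists are hypothetical helpers, and the values passed in are placeholders for your own SYBASE, SYBASE_ASE, and Text_Server_Name settings. The path layout follows the description above.

```python
import os


def text_server_run_file(sybase, sybase_ase, text_server_name):
    """Build $SYBASE/$SYBASE_ASE/install/RUN_<Text_Server_Name>."""
    return os.path.join(sybase, sybase_ase, "install",
                        "RUN_" + text_server_name)


def run_file_exists(sybase, sybase_ase, text_server_name):
    """Return True if the text server startup file is a regular file."""
    path = text_server_run_file(sybase, sybase_ase, text_server_name)
    return os.path.isfile(path)
```

Run the check on each node that can host the resource, using the same values that are set in the Environment_file.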


272732 scvxvmlg warning - chmod(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


273018 INTERNAL ERROR CMM: Failure starting CMM.

Description:

An instance of the userland CMM encountered an internal initialization error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


273354 CMM: Node %s (nodeid = %d) is dead.

Description:

The specified node has died. It is guaranteed to be no longer running and it is safe to take over services from the dead node.

Solution:

The cause of the node failure should be resolved and the node should be rebooted if node failure is unexpected.


273638 The entry %s and entry %s in property %s have the same port number: %d.

Description:

The two entries in the list property have the same port number.

Solution:

Remove one of the entries or change its port number.
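Duplicate port numbers can be located before the property is set. The following Python sketch is illustrative only: find_duplicate_ports is a hypothetical helper, and the entry format "<port>/<protocol>" (for example 80/tcp) is an assumption that is not taken from this guide. Entries are compared by port number alone.

```python
def find_duplicate_ports(entries):
    """Return (port, first_index, repeat_index) for each repeated port.

    Assumes each entry looks like "<port>/<protocol>"; this format is
    an assumption made for illustration.
    """
    seen = {}          # port number -> index of the first entry using it
    duplicates = []
    for index, entry in enumerate(entries):
        port = int(entry.split("/", 1)[0])
        if port in seen:
            duplicates.append((port, seen[port], index))
        else:
            seen[port] = index
    return duplicates


if __name__ == "__main__":
    # Prints [(80, 0, 2)]: port 80 appears at entries 0 and 2.
    print(find_duplicate_ports(["80/tcp", "443/tcp", "80/tcp"]))
```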


274386 reservation error(%s) - Could not determine controller number for device %s

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


274421 Port %d%c%s is listed twice in property %s, at entries %d and %d.

Description:

The port number in the message was listed twice in the named property, at the list entry locations given in the message. A port number should only appear once in the property.

Solution:

Specify the property with only one occurrence of the port number.


274506 Wrong data format from kstat: Expecting %d, Got %d.

Description:

See 176151

Solution:

See 176151


274603 No interfaces found

Description:

Solution:


274605 Server is online.

Description:

Informational message. Oracle server is online.

Solution:

None


274887 clcomm: solaris xdoor: rejected invo: door_return returned, errno = %d

Description:

An unusual but harmless event occurred. System operations continue unaffected.

Solution:

No user action is required.


274901 Invalid protocol %s given as part of property %s.

Description:

The property named does not have a legal value.

Solution:

Assign the property a legal value.


275999 TRANSPORT: rsm_get_controller failed, dev %s %d is not available


276380 "pmfadm -k": Error signaling <%s>: %s

Description:

An error occurred while rpc.pmfd attempted to send a signal to one of the processes of the given tag. The reason for the failure is also given. The signal was sent as a result of a 'pmfadm -k' command.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


276672 reservation error(%s) - did_get_did_path() error

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


277995 (%s) msg of wrong version %d, expected %d

Description:

Expected to receive a message of a different version. udlmctl will fail.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


278240 Stopping fault monitor using pmf tag %s.

Description:

The fault monitor will be stopped using the Process Monitoring Facility (PMF), with the tag indicated in the message.

Solution:

This is an informational message, no user action is needed.


278654 Unable to determine password for broker %s.

Description:

Cannot retrieve the password for the broker.

Solution:

Check that the scs1mqconfig file is accessible and correctly specifies the password.


279035 Unparsed sysevent received

Description:

Solution:


279084 CMM: node reconfiguration

Description:

The cluster membership monitor has processed a change in node or quorum status.

Solution:

This is an informational message, no user action is needed.


279152 listener %s probe successful.

Description:

Informational message. Listener monitor successfully completed first probe.

Solution:

None


279309 Failfast: Invalid failfast mode %s specified. Returning default mode PANIC.

Description:

An invalid value was supplied for the failfast mode. The software will use the default PANIC mode instead.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


280108 clcomm: unable to rebind %s to name server

Description:

The name server would not rebind this entity.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


280256 clnt_tp_create failed: %s

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) because it could not create the rpc handle. The rpc error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


280962 No ROOT HOST CONFIGURED.

Description:

The ROOT HOST is not configured in bv1to1.conf file.

Solution:

Reconfigure the BV site with the correct ROOT HOST.


281386 dl_attach: DL_OK_ACK rtnd prim %u

Description:

Wrong primitive returned to the DL_ATTACH_REQ.

Solution:

Reboot the node. If the problem persists, check the documentation for the private interconnect.


281428 Failed to retrieve the resource group handle: %s.

Description:

An API operation on the resource group has failed.

Solution:

For the resource group name, check the syslog tag. For more details, check the syslog messages from other components. If the error persists, reboot the node.


281680 fatal: couldn't initialize ORB, possibly because machine is booted in non-cluster mode

Description:

The rgmd was unable to initialize its interface to the low-level cluster machinery. This might occur because the operator has attempted to start the rgmd on a node that is booted in non-cluster mode. The rgmd will produce a core file, and in some cases it might cause the node to halt or reboot to avoid data corruption.

Solution:

If the node is in non-cluster mode, boot it into cluster mode before attempting to start the rgmd. If the node is already in cluster mode, save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


281819 %s exited with error %s in step %s

Description:

A ucmm step execution failed in the indicated step.

Solution:

None. See /var/adm/messages for previous errors and report this problem if it occurs again during the next reconfiguration.


282406 fork1 returned %d. Exiting.

Description:

clexecd program has encountered a failed fork1(2) system call. The error message indicates the error number for the failure.

Solution:

If the error number is 12 (ENOMEM), install more memory, increase swap space, or reduce peak memory consumption. If error number is something else, contact your authorized Sun service provider to determine whether a workaround or patch is available.
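An error number from the message can be translated to its symbolic name and description with a short script. The following Python sketch is illustrative only: describe_errno is a hypothetical helper that relies on the standard errno and os modules. Error numbers are platform specific, so run it on the same platform that logged the message.

```python
import errno
import os


def describe_errno(number):
    """Return '<NAME>: <description>' for a system error number."""
    # errno.errorcode maps numbers to names such as ENOMEM; numbers
    # that are not defined on this platform are reported as UNKNOWN.
    name = errno.errorcode.get(number, "UNKNOWN")
    return name + ": " + os.strerror(number)


if __name__ == "__main__":
    print(describe_errno(errno.ENOMEM))
```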


282508 INTERNAL ERROR: r_state_at_least: state <%s> (%d)

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


282828 reservation warning(%s) - MHIOCRELEASE error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


282980 liveCache %s was brought down outside of Sun Cluster. Sun Cluster will suspend monitoring for it until it is started up successfully again by the user.

Description:

The liveCache fault monitor detected that liveCache was intentionally brought down by the user outside of Sun Cluster. Sun Cluster will not take any action on it until liveCache is started up successfully again by the user.

Solution:

No action is needed if the shutdown is intended. If not, start up liveCache again using LC10 or lcinit, so it can be under the monitoring of Sun Cluster.


283262 HA: rm_state_machine::service_suicide() not yet implemented

Description:

Unimplemented feature was activated.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


283767 network is very slow

Description:

This means that the PNM daemon was not able to read data from the network - either the network is very slow or the resources on the node are dangerously low.

Solution:

It is best to restart the PNM daemon. Send KILL (9) signal to pnmd. PMF will restart pnmd automatically. If the problem persists, restart the node with scswitch -S and shutdown(1M).


284006 reservation fatal error(UNKNOWN) - Out of memory

Description:

The device fencing program has been unable to allocate required memory.

Solution:

Memory usage should be monitored on this node and steps taken to provide more available memory if problems persist. Once memory has been made available, the following steps may need to be taken: If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, access to shared devices can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. The device group can be switched back to this node if desired by using the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


284560 Failed to offload resource group %s: %s

Description:

An attempt to offload the specified resource group failed. The reason for the failure is logged.

Solution:

Look for the message indicating the reason for this failure. This should help in the diagnosis of the problem. Otherwise, save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


284635 BV Update Completed successfully.

Description:

This is an informational message that the update method completed successfully.

Solution:

No action needed.


284644 Warning: node %d has a weight assigned to it for property %s, but node %d is not in the %s for resource %s.

Description:

A node has a weight assigned but the resource can never be active on that node, therefore it doesn't make sense to assign that node a weight.

Solution:

This is an informational message, no user action is needed. Optionally, the weight that is assigned to the node can be omitted.


284702 Error parsing URI: %s (%s)

Description:

There was an error parsing the URI for the reason given.

Solution:

Fix the syntax of the URI.


286722 scvxvmlg error - remove(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


286807 clnt_tp_create_timed of program %s failed %s.

Description:

HA-NFS fault monitor was not able to make an rpc connection to an nfs server.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


287657 Failed to open <%s>: %s.


288900 internal error: invalid reply code


289194 Can't perform failover: Failover mode set to NONE.

Description:

Cannot perform failover of the data service. Failover mode is set to NONE.

Solution:

This is an informational message. If failover is desired, then set the Failover_mode value to SOFT or HARD using scrgadm(1M).


289503 Unable to re-compute NFS resource list.

Description:

The list of HA-NFS resources online on the node has gotten corrupted.

Solution:

Make sure there is space available in /tmp. If the error is showing up despite that, reboot the node.


290644 Started sap processes under PMF successfully.

Description:

Informational message. SAP is being started under the control of Process Monitoring Facility (PMF).

Solution:

No action needed.


290735 Conversion of hostnames failed.

Description:

Data service is unable to convert the specified hostname into an IP address.

Solution:

Check the syslog messages that occurred just before, to check whether there is any internal error. If there is, then contact your authorized Sun service provider. Otherwise, if the logical host and shared address entries are specified in the /etc/inet/hosts file, check these entries are correct. If this is not the reason then check the health of the name server.


290926 Successful validation.

Description:

The validation of the configuration for the data service was successful.

Solution:

None. This is only an informational message.


291077 Invalid variable name in the Environment file %s. Ignoring %s

Description:

HA-Oracle reads the file specified in USER_ENV property and exports the variables declared in the file. Syntax for declaring the variables is : VARIABLE=VALUE Lines starting with ' VARIABLE is expected to be a valid Korn shell variable that starts with alphabet or '_' and contains alphanumerics and '_'.

Solution:

Please check the environment file and correct the syntax errors. Do not use export statement in environment file.
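
The naming rule in this entry (a name that starts with a letter or '_' and contains only alphanumerics and '_', declared as VARIABLE=VALUE) can be checked before the file is put into use. The following Python sketch is an illustration under stated assumptions, not the check that HA-Oracle itself performs: the function name invalid_lines is hypothetical, blank lines are assumed to be skipped, and comment handling is not modeled because the source text describing it is incomplete.

```python
import re

# A name starts with a letter or '_' and contains only letters, digits, and '_'.
VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def invalid_lines(text: str) -> list:
    """Return the lines that are not VARIABLE=VALUE with a valid variable name."""
    bad = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # assumption: blank lines are ignored
        name, sep, _value = stripped.partition("=")
        # A missing '=' or a name such as 'export FOO' fails the check.
        if not sep or not VALID_NAME.fullmatch(name):
            bad.append(line)
    return bad


sample = "ORACLE_BASE=/u01/app\nexport TNS_ADMIN=/var/opt/oracle\n2BAD=1\n_OK=yes\n"
print(invalid_lines(sample))
```

In the sample, the line that uses an export statement and the line whose name starts with a digit are reported, which matches the advice not to use export statements in the environment file.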


291245 Invalid type %d passed.

Description:

An invalid value was passed for the program_type argument in the pmf routines.

Solution:

This is a programming error. Verify the value specified for program_type argument and correct it. The valid types are: SCDS_PMF_TYPE_SVC: data service application SCDS_PMF_TYPE_MON: fault monitor SCDS_PMF_TYPE_OTHER: other


291378 No Database probe script specified. Will Assume the Database is Running

Description:

This is an informational message. Probing of the URLs set in the Server_url or the Monitor_uri_list failed. Before taking any action, the WLS probe makes sure the DB is up (if a db_probe_script extension property is set). However, the database probe script is not specified. The probe will assume that the DB is up and will go ahead and take action on the WLS.

Solution:

Make sure the DB is UP.


291716 Error while executing siebenv.sh.

Description:

There was an error while attempting to execute (source) the specified file. This may be due to improper permissions, or improper settings in this file.

Solution:

Please verify that the file has correct permissions. If permissions are correct, verify all the settings in this file. Try to manually source this file in korn shell ('. siebenv.sh'), and correct any errors.


291986 dl_bind ack bad len %d

Description:

Sanity check. The message length in the acknowledgment to the bind request is different from what was expected. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


292013 clcomm: UioBuf: uio was too fragmented - %d

Description:

The system attempted to use a uio that had more than DEF_IOV_MAX fragments.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


295366 Unable to mark the interface %s%d down, rc %d

Description:

The Topology Manager had finished using the adapter and failed when it tried to mark the interface down.


295666 clcomm: setrlimit(RLIMIT_NOFILE): %s

Description:

During cluster initialization within this user process, the setrlimit call failed with the specified error.

Solution:

Read the man page for setrlimit for a more detailed description of the error.


295838 Listener %s started.

Description:

Informational message. HA-Oracle successfully started Oracle listener.

Solution:

None


296046 INTERNAL ERROR: resource group <%s> is PENDING_BOOT or ERROR_STOP_FAILED or ON_PENDING_R_RESTART, but contains no resources

Description:

The operator is attempting to delete the indicated resource group. Although the group is empty of resources, it was found to be in an unexpected state. This will cause the resource group deletion to fail.

Solution:

Use scswitch(1M) -z to switch the resource group offline on all nodes, then retry the deletion operation. Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


297061 clcomm: can't get new reference

Description:

An attempt was made to obtain a new reference on a revoked handler.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


297139 CCR: More than one data server has override flag set for the table %s. Using the table from node %s.

Description:

The override flag for a table indicates that the CCR should use this copy as the final version when the cluster is coming up. In this case, the CCR detected multiple nodes having the override flag set for the indicated table. It chose the copy on the indicated node as the final version.

Solution:

This is an informational message, no user action is needed.


297178 Error opening procfs control file <%s> for tag <%s>: %s

Description:

The rpc.pmfd server was not able to open a procfs control file, and the system error is shown. procfs control files are required in order to monitor user processes.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


297325 The node portion of %s at position %d in property %s is not a valid node identifier or node name.

Description:

An invalid node was specified for the named property. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Specify a valid node instead.


297536 Could not host device service %s because this node is being shut down

Description:

An attempt was made to start a device group on this node while the node was being shut down.

Solution:

If the node was not being shut down during this time, or if the problem persists, please contact your authorized Sun service provider to determine whether a workaround or patch is available.


297867 (%s) t_bind: tli error: %s

Description:

Call to t_bind() failed. The "t_bind" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


298060 Invalid project name: %s

Description:

Either the given project name doesn't exist, or root is not a valid user of the given project.

Solution:

Check if the project name is valid and root is a valid user of that project.


298320 Command %s is too long.

Description:

The command string passed to the function is too long.

Solution:

Use a shorter command name or shorter path to the command.


298532 Script lccluster does not exist.

Description:

Script 'lccluster' is not found under /sapdb/<livecache_Name>/db/sap/.

Solution:

Follow the HA-liveCache installation guide to create 'lccluster'.


298719 getrlimit: %s

Description:

The rpc.pmfd or rpc.fed server was not able to get the limit of files open. The message contains the system error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


298911 setrlimit: %s

Description:

The rpc.pmfd server was not able to set the limit of files open. The message contains the system error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


299417 in libsecurity strong Unix authorization failed

Description:

A server (rgmd) refused an rpc connection from a client because it failed the Unix authentication. This happens if a caller program using scha public api, either in its C form or its CLI form, is not running as root or is not making the rpc call over the loopback interface. An error message is output to syslog.

Solution:

Check that the calling program using the scha public api is running as root and is calling over the loopback interface. If both are correct, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


299639 Failed to retrieve the property %s: %s. Will shutdown the WLS using sigkill

Description:

This is an internal error. The property could not be retrieved. The WLS stop method however will go ahead and kill the WLS process.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

Message IDs 300000–399999


300397 resource %s property changed.

Description:

This is a notification from the rgmd that a resource's property has been edited by the cluster administrator. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


300777 reservation warning(%s) - Unable to open device %s, errno %d, will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


301092 file specified in ENVIRONMENT_FILE parameter %s does not exist.

Description:

The 'Environment_File' property was set when configuring the resource. The file specified by the 'Environment_File' property may not exist. The file should be readable and specified with a fully qualified path.

Solution:

Specify an existing file with a fully qualified file name when creating a resource.


301573 clcomm: error in copyin for cl_change_flow_settings

Description:

The system failed a copy operation supporting a flow control state change.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


301603 fatal: cannot create any threads to handle switchback

Description:

The rgmd was unable to create a sufficient number of threads upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


301635 clexecd: close returned %d. Exiting.

Description:

clexecd program has encountered a failed thr_create(3THR) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


301884 This node is running software incompatible with the rest of the cluster and will shut down.

Description:

The cluster version manager exchanges version information between nodes running in the cluster and has detected an incompatibility. This is usually the result of performing a rolling upgrade where one or more nodes has been installed with a software version that the other cluster nodes do not support. This error may also be due to attempting to boot a cluster node in 64-bit address mode when other nodes are booted in 32-bit address mode, or vice versa.

Solution:

Verify that any recent software installations completed without errors and that the installed packages or patches are compatible with the rest of the installed software. Save the /var/adm/messages file. Check the messages file for earlier messages related to the version manager which may indicate which software component is failing to find a compatible version.


302670 udlm_setup_port: fcntl: %d

Description:

A server was not able to execute fcntl(). udlm exits and the node aborts and panics.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


303231 mount_client_impl::remove_client() failed attempted RM change_repl_prov_status() to remove client, spec %s, name %s

Description:

The system was unable to remove a PXFS replica on the node where this message was seen.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


303434 Warning: could not validate the settings in <%s>. It is recommended that the settings for host lookup consult "files" before a name server


303805 Cannot change the IPMP group on the local host.

Description:

A different IPMP group for property NetIfList is specified in the scrgadm command. The IPMP group on the local node is set at resource creation time. Users may only update the value of property NetIfList to add an IPMP group on a new node.

Solution:

Rerun the scrgadm command with proper value of property NetIfList.


303879 INTERNAL ERROR: Unable to lock %s: %s.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


303941 Unsuccessful probe of %s port %d for non-secure resource %s. (%s)

Description:

An error occurred while fault monitor attempted to probe the health of the data service.

Solution:

Wait for the fault monitor to correct this by doing restart or failover. For more error description, look at the syslog messages.


304365 clcomm: Could not create any threads for pool %d

Description:

The system creates server threads to support requests from other nodes in the cluster. The system could not create any server threads during system startup. This is caused by a lack of memory.

Solution:

There are two solutions. Install more memory. Alternatively, take steps to reduce memory usage. Since the creation of server threads takes place during system startup, application memory usage is normally not a factor.


305298 cm_callback_impl abort_trans: exiting

Description:

ucmm callback for abort transition failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


306407 Failed to stop Adaptive server.

Description:

Sun Cluster HA for Sybase failed to stop the Adaptive Server using the KILL signal.

Solution:

Please examine whether any Sybase server processes are running on the server. Please manually shut down the server.


307195 clcomm: error in copyin for cl_read_flow_settings

Description:

The system failed a copy operation supporting flow control state reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


308800 ERROR: rebalance: <%s> is pending_methods on node <%d>

Description:

An internal error has occurred in the locking logic of the rgmd. This error should not occur. It may prevent the rgmd from bringing the indicated resource group online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


309875 Error encountered enabling failfast.

Description:

An error occurred while attempting to enable the reservation failfast on the disks that are shared by other nodes.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log, and /var/cluster/ucmm/dlm*/logs/* from all the nodes and contact your Sun service representative.


310953 clnt_control of program %s failed %s.

Description:

HA-NFS fault monitor failed to reset the retry timeout for retransmitting the rpc request.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


311463 Failover attempt failed.

Description:

The failover attempt of the resource is rejected or encountered an error.

Solution:

For more detailed error message, check the syslog messages. Check whether the Pingpong_interval has appropriate value. If not, adjust it using scrgadm(1M). Otherwise, use scswitch to switch the resource group to a healthy node.


311808 Can not open /etc/mnttab: %s

Description:

An error occurred while opening /etc/mnttab. The system error is shown in the message.

Solution:

Check with system administrator and make sure /etc/mnttab is properly defined.


312004 Value of Start_timeout property may be small for %d max_offload_retry for %d resource groups to offload.

Description:

This is a warning message indicating that you may have set max_offload_retry to a high value, which may cause the start method of the RGOffload resource to time out before an attempt can be made to offload all specified resource groups.

Solution:

Please calculate the max_offload_retry so that the Start_timeout is not exceeded if every resource group that has to be offloaded requires maximum retries. There is a 10 second interval between successive retries.
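
The calculation can be sketched as follows. This Python example is a rough estimate under stated assumptions: it assumes the resource groups are offloaded one after another and counts only the 10 second interval between retries, not the time each offload attempt itself takes. The function names are hypothetical.

```python
RETRY_INTERVAL_SECONDS = 10  # interval between successive retries, per the text


def worst_case_wait(max_offload_retry: int, group_count: int) -> int:
    """Seconds spent waiting if every resource group needs the maximum retries."""
    return max_offload_retry * group_count * RETRY_INTERVAL_SECONDS


def largest_safe_retry(start_timeout: int, group_count: int) -> int:
    """Largest max_offload_retry whose worst-case wait fits in Start_timeout."""
    if group_count <= 0:
        raise ValueError("group_count must be positive")
    return start_timeout // (group_count * RETRY_INTERVAL_SECONDS)


print(worst_case_wait(15, 3))      # 15 retries x 3 groups x 10 s = 450
print(largest_safe_retry(300, 3))  # 300 // (3 x 10) = 10
```

With a Start_timeout of 300 seconds and three resource groups to offload, a max_offload_retry of 15 would need up to 450 seconds of retry waits alone, so a value of 10 or less is the safer choice under these assumptions.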


312053 Cannot execute %s: %s.

Description:

Failure in executing the command.

Solution:

Check the syslog message for the command description. Check whether the system is low in memory or the process table is full and take appropriate action. Make sure that the executable exists.


313510 SAP xserver is not available.

Description:

SAP xserver is not running currently.

Solution:

Informative message, no action is required.


313806 pm_tick delay of %lld ms exceeds %lld ms


313867 Unknown step: %s

Description:

Request to run an unknown udlm step.

Solution:

None.


314314 prog <%s> step <%s> terminated due to receipt of signal <%d>

Description:

ucmmd step terminated due to receipt of a signal.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


314341 Invalid probe values. Retry_interval (currently set to %d) must be greater than or equal to the product of Thorough_probe_interval (currently set to % d), and Retry_count (currently set to %d).

Description:

Validation of the probe related parameters failed because invalid values were specified.

Solution:

Retry_interval must be greater than or equal to the product of Thorough_probe_interval, and Retry_count. Use scrgadm(1M) to modify the values of these parameters so that they will hold the above relationship.
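
The required relationship is simple arithmetic and can be checked before running scrgadm(1M). The following Python sketch only illustrates the inequality stated above; the function name probe_values_valid and the sample values are assumptions made for this example.

```python
def probe_values_valid(retry_interval: int,
                       thorough_probe_interval: int,
                       retry_count: int) -> bool:
    """True when Retry_interval >= Thorough_probe_interval * Retry_count."""
    return retry_interval >= thorough_probe_interval * retry_count


print(probe_values_valid(300, 60, 2))  # 300 >= 120
print(probe_values_valid(100, 60, 2))  # 100 < 120
```

The first call satisfies the relationship; the second would trigger this message because 100 is less than 60 multiplied by 2.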


314356 resource %s enabled.

Description:

This is a notification from the rgmd that the operator has enabled a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


314358 Command %s failed to complete. Return code is %d.

Description:

The listed command failed to complete with the listed return code. The return code is from the script db_clear.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


315446 node id <%d> is out of range

Description:

The low-level cluster machinery has encountered an error.

Solution:

Look for other syslog messages occurring just before or after this one on the same node; they may provide a description of the error.


316215 Process sapsocol is already running outside of Sun Cluster. Will terminate it now, and restart it under Sun Cluster.

Description:

The SAP OS collector process is running outside of the control of Sun Cluster. HA-SAP will terminate it and restart it under the control of Sun Cluster.

Solution:

Informational message. No user action needed.


317263 Unable to retrieve the cluster handle: %s.


318636 No executable $BV1TO1/bin/xbvconf

Description:

The specified executable is not found.

Solution:

Check if the Broadvision software was installed properly. Make sure the specified executable is available at the right location.


319047 CMM: Issuing a SCSI2 Tkown failed for quorum device %s with error %d.

Description:

This node encountered the specified error while issuing a SCSI2 Tkown operation on the indicated quorum device. This will cause the node to conclude that it has been unsuccessful in preempting keys from the quorum device, and therefore the partition to which it belongs has been preempted. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

If the error encountered is EACCES, then the SCSI2 command could have failed due to the presence of SCSI3 keys on the quorum device. Scrub the SCSI3 keys off of it, and reboot the preempted nodes.
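
The majority rule described for this message can be illustrated with a short calculation. The following Python sketch assumes that "majority quorum" means strictly more than half of the total configured votes; the function name and the vote counts in the example (two nodes with one vote each and a quorum device with one vote) are hypothetical.

```python
def has_majority(partition_votes: int, total_votes: int) -> bool:
    """True when a partition holds strictly more than half of all votes."""
    return partition_votes * 2 > total_votes


TOTAL_VOTES = 3  # hypothetical: 2 node votes + 1 quorum device vote

# The partition that acquires the quorum device holds 2 of 3 votes.
print(has_majority(2, TOTAL_VOTES))
# The partition that fails to acquire it holds 1 of 3 votes.
print(has_majority(1, TOTAL_VOTES))
```

Because the comparison is strict, two disjoint subclusters can never both hold a majority, which is why exactly one of them survives.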


319048 CCR: Cluster has lost quorum while updating table %s, it is possibly in an inconsistent state - ABORTING.

Description:

The cluster lost quorum while the indicated table was being changed, leading to potential inconsistent copies on the nodes.

Solution:

Check whether the indicated table is consistent on all the nodes in the cluster. If it is not, boot the cluster in -x mode to restore the indicated table from backup. The CCR tables are located at /etc/cluster/ccr/.


319261 liveCache %s failed to start.

Description:

liveCache started up with error.

Solution:

Sun Cluster will fail over the liveCache resource to another available node. No user action is needed.


319375 clexecd: wait_for_signals got NULL.

Description:

The clexecd program encountered an error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


319413 Siebel server can be started only after Siebel database and Siebel gateway are running, and the scsblconfig file is correctly configured.

Description:

This is a warning message indicating a problem in determining the status of Siebel database and/or the Siebel gateway.

Solution:

Please verify that the scsblconfig file is correctly configured, and that the Siebel database and Siebel gateway are up before attempting to start the Siebel server.


320378 INTERNAL ERROR: usage: $0 <server_root> <siebel_enterprise>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


320833 INTERNAL ERROR usage:$0 BVUSER BV1TO1_VAR bv_local_host action IT_CONNECT_ATTEMPTS BV_ORB_CONNECT_TIMEOUT

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


321245 resource <%s> is disabled but not offline

Description:

While attempting to execute an operator-requested enable or disable of a resource, the rgmd has found the indicated resource to have its Onoff_switch property set to DISABLED, yet the resource is not offline. This suggests corruption of the RGM's internal data and will cause the enable or disable action to fail.

Solution:

This may indicate an internal error or bug in the rgmd. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


321667 clcomm: cl_comm: not booted in cluster mode.

Description:

Attempted to load the cl_comm module when the node was not booted as part of a cluster.

Solution:

Users should not explicitly load this module.


321962 Command %s failed to complete. HA-SAP will continue to start SAP.

Description:

The command to cleanipc failed to complete.

Solution:

This is an internal error. No user action needed. Save the /var/adm/messages from all nodes. Contact your authorized Sun service provider.


322373 Unable to unplumb %s%d, rc %d

Description:

The Topology Manager had finished using the interface and failed to unplumb the adapter.


322642 Error binding to %s port %d for non-secure resource %s: %s (%s)

Description:

An error occurred while fault monitor attempted to probe the health of the data service.

Solution:

Wait for the fault monitor to correct this by doing restart or failover. For more error description, look at the syslog messages.


322675 Some NFS system daemons are not running.

Description:

HA-NFS fault monitor checks the health of statd, lockd, mountd and nfsd daemons on the node. It detected that one or more of these are not currently running.

Solution:

No action. The monitor would restart these. If it doesn't, reboot the node.


322797 Error registering provider '%s' with the framework.

Description:

The device configuration system on this node has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


322862 clcomm: error in copyin for cl_read_threads_min

Description:

The system failed a copy operation supporting flow control state reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


322879 clcomm: Invalid copyargs: node %d pid %d

Description:

The system does not support copy operations between the kernel and a user process when the specified node is not the local node.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


322908 CMM: Failed to join the cluster: error = %d.

Description:

The local node was unsuccessful in joining the cluster.

Solution:

There may be other related messages on this node that may indicate the cause of this failure. Resolve the problem and reboot the node.


323498 libsecurity: NULL RPC to program %ld failed will retry %s

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection, because it could not execute a test rpc call, and the program will retry to establish the connection. The message shows the specific rpc error. The program number is shown. To find out what program corresponds to this number, use the rpcinfo command. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


324478 (%s): Error %d from read

Description:

An error was encountered in the clexecd program while reading the data from the worker process.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


325322 clcomm: error in copyin for state_resource_pool

Description:

The system failed a copy operation supporting statistics reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


326043 reservation fatal error(%s) - release_resv_lock() returned exception

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


326528 Nonexistent Broker_name (%s).

Description:

The broker name provided in the extension property Broker_Name does not exist.

Solution:

Check that a broker instance exists for the supplied broker name. It should match the broker name portion of the path in Confdir_list.


327057 SharedAddress stopped.

Description:

The stop method is completed and the resource is stopped.

Solution:

This is an informational message. No user action is required.


329286 (%s) instead of UDLM_ACK got a %d

Description:

Did not receive an acknowledgment from udlm as was expected.

Solution:

None.


329429 reservation fatal error(%s) - host_name not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


329496 unlatch_intention(): IDL exception when communicating to node %d

Description:

An inter-node communication failed, probably because a node died.

Solution:

No action is required; the rgmd should recover automatically.


329616 svc_probe used entire timeout of %d seconds during connect operation and exceeded the timeout by %d seconds. Attempting disconnect with timeout %d

Description:

The probe timed out connecting to the application.

Solution:

If the problem persists investigate why the application is responding slowly or if the Probe_timeout property needs to be increased.


329778 clconf: Data length is more than max supported length in clconf_ccr read

Description:

While reading configuration data through the CCR, the data length was found to exceed the maximum supported length.

Solution:

Check the CCR configuration information.


329847 Warning: node %d has a weight of 0 assigned to it for property %s.

Description:

The named node has a weight of 0 assigned to it. A weight of 0 means that no new client connections will be distributed to that node.

Solution:

Consider assigning the named node a non-zero weight.
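
As a rough aid for spotting the condition this warning describes, the Python sketch below lists the nodes whose weight is 0. It assumes the property value is a list of entries of the form weight@node; that format and the function names are assumptions made for illustration, not something this message confirms.

```python
def parse_weights(entries):
    """Parse entries of the assumed form 'weight@node' into {node: weight}."""
    weights = {}
    for entry in entries:
        weight_text, node = entry.split("@", 1)
        weights[node.strip()] = int(weight_text)
    return weights


def zero_weight_nodes(weights):
    """Return the nodes that would receive no new client connections."""
    return sorted(node for node, weight in weights.items() if weight == 0)


if __name__ == "__main__":
    # Hypothetical property value for a three-node cluster.
    weights = parse_weights(["3@1", "0@2", "1@3"])
    print(zero_weight_nodes(weights))
```

A weight of 0 can be intentional, for example to drain a node, so an entry in this list is a prompt to review the setting rather than proof of a fault.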


329957 odd table entry


330063 error in vop open %x

Description:

Opening a private interconnect interface failed.

Solution:

Reboot of the node might fix the problem.


330182 Internal error: default value missing for resource property

Description:

A non-fatal internal error has occurred in the RGM.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


330526 CMM: Number of steps specified in registering callback = %d; should be <= %d.

Description:

The number of steps specified during registering a CMM callback exceeds the allowable maximum. This is an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


331221 CMM: Max detection delay specified is %ld which is larger than the max allowed %ld.

Description:

The maximum of the node down detection delays is larger than the allowable maximum. The maximum allowed will be used as the actual maximum in this case.

Solution:

This is an informational message, no user action is needed.


331325 sigprocmask: %s

Description:

The rpc.fed server encountered an error with the sigprocmask function, and was not able to start. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


333069 Failed to retrieve nodeid for %s.

Description:

The nodeid for the given name could not be determined.

Solution:

Make sure that the name given is a valid node identifier or node name.


333455 Multi-IP group '%s' removed

Description:

The Multi-IP group by that name is removed.

Solution:

This is an informational message, no user action is needed.


333928 LogicalHostname offline.


334697 Failed to retrieve the cluster property %s: %s.

Description:

The query for a property failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


334992 clutil: Adding deferred task after threadpool shutdown id %s

Description:

During shutdown this operation is not allowed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


335206 Failed to get host names from the resource.

Description:

Retrieving the IP addresses from the network resources from this resource group has failed.

Solution:

An internal error or an API call failure might be the reason. Check the error messages that occurred just before this message. If there is an internal error, contact your authorized Sun service provider. For an API call failure, check the syslog messages from other components. For the resource name and resource group name, check the syslog tag.


335591 Failed to retrieve the resource group property %s: %s.

Description:

An API operation has failed while retrieving the resource group property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem reoccurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components. For the resource group name and the property name, check the current syslog message.


335468 Time allocated to stop development system is too small (less than 5 seconds).

Description:

Time allocated to stop the development system is too small.

Solution:

The time for stopping the development system is a percentage of the total Start_timeout. Increase the value for property Start_timeout or the value for property Dev_stop_pct.
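
The arithmetic behind this check can be sketched in Python. The sketch assumes the stop time is simply Start_timeout multiplied by Dev_stop_pct / 100; the rounding the data service actually applies is not stated here, and the function and constant names are hypothetical.

```python
MIN_DEV_STOP_SECONDS = 5  # minimum taken from the message text


def dev_stop_seconds(start_timeout, dev_stop_pct):
    """Seconds allotted to stopping the development system.

    Assumes a plain percentage of Start_timeout; the data service's
    own rounding may differ.
    """
    return start_timeout * dev_stop_pct / 100.0


def is_stop_time_sufficient(start_timeout, dev_stop_pct):
    """True if the allotted time reaches the 5-second minimum."""
    return dev_stop_seconds(start_timeout, dev_stop_pct) >= MIN_DEV_STOP_SECONDS


if __name__ == "__main__":
    # Example values only: 40 s at 10 percent yields 4 s, which is too small.
    print(dev_stop_seconds(40, 10), is_stop_time_sufficient(40, 10))
```

Under this assumption, raising either property increases the allotted time, which matches the two remedies given above.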


336860 read %d for %snum_ports

Description:

Could not get information about the number of ports udlm uses from config file udlm.conf.

Solution:

Check that the udlm.conf file exists and has an entry for udlm.num_ports. If everything looks normal and the problem persists, contact your Sun service representative.


337008 rgm_comm_impl::_unreferenced() called unexpectedly

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


337073 $BV1TO1_VAR is not a directory.

Description:

The specified environment variable does not point to the right directory.

Solution:

Set the specified environment variable correctly.


337166 Error setting environment variable %s.

Description:

An error occurred while setting the environment variable LD_LIBRARY_PATH. This is required by the fault monitor for the nsldap data service. The fault monitor appends the ldap server root path, including the lib directory, to the LD_LIBRARY_PATH environment variable.

Solution:

Check that there is a lib directory under the server root of the nsldap data service which pertains to this resource. If this directory has been removed, then it must be replaced by reinstalling Netscape Directory Server, or whatever other means are appropriate.


337212 resource type %s removed.

Description:

This is a notification from the rgmd that a resource type has been deleted. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


338067 This resource does not depend on any SUNW.HAStoragePlus resources. Proceeding with normal checks.

Description:

The resource does not depend on any HAStoragePlus filesystems. The validation will continue with its other checks.

Solution:

This message is informational; no user action is needed.


338839 clexecd: Could not create thread. Error: %d. Sleeping for %d seconds and retrying.

Description:

The clexecd program has encountered a failed thr_create() system call. The error message indicates the error number for the failure. The program will retry the system call after the specified time.

Solution:

If the message is seen repeatedly, contact your authorized Sun service provider to determine whether a workaround or patch is available.


339424 Could not host device service %s because this node is being removed from the list of eligible nodes for this service.

Description:

A switchover/failover was attempted to a node that was being removed from the list of nodes that could host this device service.

Solution:

This is an informational message, no user action is needed.


339521 CCR: Lost quorum while starting to update table %s.

Description:

The cluster lost quorum when CCR started to update the indicated table.

Solution:

Reboot the cluster.


339590 Error (%s) when reading property %s.

Description:

Unable to read property value using API. Property name is indicated in message. Syslog messages may give more information on errors in other modules.

Solution:

Check syslog messages. Please report this problem.


339657 Issuing a restart request.

Description:

This is an informational message. The API function is about to be called to request a restart. In case of failure, check the syslog messages that follow this message.

Solution:

No user action is needed.


339954 fatal: cannot create any threads to launch callback methods

Description:

The rgmd was unable to create a thread upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


340287 idl_set_timestamp(): IDL Exception

Description:

The rgmd has encountered an error that prevents the scha_control function from successfully setting a ping-pong time stamp, presumably because a node died. This does not prevent the attempted failover from succeeding, but in the worst case, might prevent the anti-"pingpong" feature from working, which may permit a resource group to fail over repeatedly between two or more nodes.

Solution:

Examine syslog output on the node that rebooted, to determine the cause of node death. The syslog output might indicate further remedial actions.


340893 The stop command '%s' failed to stop %s. Using SIGKILL.

Description:

The specified stop command was unable to stop the specified resource. A SIGKILL signal will be sent to all the processes associated with the resource.

Solution:

No action required by the user. This is an informational message.


341502 Unable to plumb even after unplumbing. rc = %d

Description:

Topology Manager failed to plumb an adapter for private network. A possible reason for plumb to fail is that it is already plumbed. Solaris clustering has successfully unplumbed the adapter but failed while trying to plumb for private use.


341719 Restarting daemon %s.

Description:

HA-NFS is restarting the specified daemon.

Solution:

No action.


341754 INTERNAL ERROR: usage: $0 <logicalhost> <server_root> <siebel_enterprise> <siebel_servername> <timeout>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


341804 Failed to retrieve information for user %s.

Description:

Failed to retrieve information for the specified BV user.

Solution:

Check if the proper Broadvision Unix User ID is set, or check if this user exists on all the nodes of the cluster.


342113 Invalid script name %s. It cannot contain any '/'.

Description:

The script name should be just the script name, no path is needed.

Solution:

Specify just the script name without any path.
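
The rule is simple enough to check before setting the property. The following Python sketch is illustrative only; the function name is hypothetical and the check covers just the '/' restriction named in the message.

```python
import os


def is_valid_script_name(name):
    """A name passes this check if it is non-empty and contains no '/'."""
    return bool(name) and "/" not in name


if __name__ == "__main__":
    candidate = "/opt/app/bin/start_app.sh"  # hypothetical value
    if not is_valid_script_name(candidate):
        # os.path.basename drops the directory part, leaving the bare name.
        candidate = os.path.basename(candidate)
    print(candidate)
```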


342336 clcomm: Pathend %p: path_down not allowed in state %d

Description:

The system maintains state information about a path. A path_down operation is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


342597 sigaddset: %s

Description:

The rpc.fed server encountered an error with the sigaddset function, and was not able to start. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


342793 Successfully started the service %s.

Description:

Specified data service started successfully.

Solution:

None. This is only an informational message.


343307 Could not open file %s: %s.

Description:

System has failed to open the specified file.

Solution:

Check whether the permissions are valid. This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action. For specific error information, check the syslog message.


345342 Failed to connect to %s port %d.

Description:

The data service fault monitor probe was trying to connect to the host and port specified and failed. There may be a prior message in syslog with further information.

Solution:

Make sure that the port configuration for the data service matches the port configuration for the underlying application.
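
To confirm by hand whether anything is listening on the configured host and port, a small connection test can help. The Python sketch below is not the fault monitor's probe; it assumes the application speaks TCP, and the function name, example host, and timeout value are hypothetical.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    # Replace with the host and port reported in the message.
    print(can_connect("localhost", 80, timeout=2.0))
```

If the connection succeeds on a different port than the one configured for the data service, the two port configurations probably do not match.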


346016 setproject: %s; continuing the process with system default project.

Description:

Either the given project name was invalid, or the caller of setproject() was not a valid user of the given project. The process was launched with project "default" instead of the specified project.

Solution:

Use the projects(1) command to check if the project name is valid and the caller is a valid user of the given project.


346036 libsecurity: unexpected getnetconfigent error

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection, because it could not get the network information. The pmfadm or scha command exits with error. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


347091 resource type %s added.

Description:

This is a notification from the rgmd that the operator has created a new resource type. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


347344 Did not find a valid port number to match field <%s> in configuration file <%s>: %s.

Description:

A failure occurred extracting a port number for the field within the configuration file. The field exists and the file exists and is accessible. The value for the field may not exist or may not be an integer greater than zero. An error in environment may have occurred, indicated by a non-zero errno value at the end of the message.

Solution:

Check to see if the value for the field in the configuration file exists and is an integer greater than zero. If there is an error in the field value, fix the value and retry the operation.
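
The condition on the value (present, and an integer greater than zero) can be expressed as a short Python check. This is a sketch of that rule only; how the data service reads the field from the configuration file is not described here, and the function name is hypothetical.

```python
import re


def parse_port(value):
    """Return the port as an int, or None if the value is missing
    or is not an integer greater than zero."""
    if value is None:
        return None
    text = value.strip()
    # Accept plain decimal digits only: no sign, no decimal point.
    if not re.fullmatch(r"[0-9]+", text):
        return None
    port = int(text)
    return port if port > 0 else None


if __name__ == "__main__":
    for raw in ["8080", "0", "http", ""]:
        print(repr(raw), parse_port(raw))
```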


348240 clexecd: putmsg returned %d.

Description:

clexecd program has encountered a failed putmsg(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


349049 CCR reported invalid table %s; halting node

Description:

The CCR reported to the rgmd that the CCR table specified is invalid or corrupted. The node will be halted to prevent further errors.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


349741 Command %s is not a regular file.

Description:

The specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start, does not refer to a regular file. This could be the result of 1) mis-configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.
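
A quick way to test a pathname against this requirement is sketched below in Python. It approximates the check described above and is not the libdsdev implementation; the function name is hypothetical.

```python
import os
import sys


def is_regular_executable(pathname):
    """True if pathname is a regular file (symbolic links are followed)
    that the calling user is permitted to execute."""
    return os.path.isfile(pathname) and os.access(pathname, os.X_OK)


if __name__ == "__main__":
    # A directory is not a regular file, so the first check fails.
    print(is_regular_executable("/"))
    # The running Python interpreter is normally a regular executable file.
    print(is_regular_executable(sys.executable))
```

Run the check as the user that launches the method, because execute permission depends on the caller.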


351777 Resource is online again.


351887 Bulk registration failed

Description:

Solution:


353557 Filesystem (%s) is locked and cannot be frozen

Description:

The file system has been locked with the _FIOLFS ioctl. It is necessary to perform an unlock _FIOLFS ioctl. The growfs(1M) or lockfs(1M) command may be responsible for this lock.

Solution:

An _FIOLFS LOCKFS_ULOCK ioctl is required to unlock the file system.


353753 invalid mask in hosts list: %s

Description:

Solution:


354821 Attempting to start the fault monitor under process monitor facility.

Description:

The function is going to request the PMF to start the fault monitor. If the request fails, refer to the syslog messages that appear after this message.

Solution:

This is an informational message, no user action is required.


355950 HA: unknown invocation result status %d

Description:

An invocation completed with an invalid result status.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


356795 CMM: Reconfiguration step %d was forced to return.

Description:

One of the CMM reconfiguration step transitions failed, probably due to a problem on a remote node. A reconfiguration is forced assuming that the CMM will resolve the problem.

Solution:

This is an informational message, no user action is needed.


356930 Property %s is empty. This property must be specified for scalable resources.

Description:

The value of the specified property must not be empty for scalable resources.

Solution:

Use scrgadm(1M) to specify a non-empty value for the property.


357263 munmap: %s

Description:

The rpc.pmfd server was not able to delete shared memory for a semaphore, possibly due to low memory, and the system error is shown. This is part of the cleanup after a client call, so the operation might have gone through. An error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


357558 %s: Unable to start in multi-threaded mode.

Description:

RPC service is unable to start the daemon in multithreaded mode.


357767 %d entries found in property %s. For a nonsecure %s instance %s should have exactly one entry.

Description:

Since a nonsecure Server instance only listens on a single port, the specified property should only have a single entry. A different number of entries was found.

Solution:

Change the number of entries to be exactly one.


357915 Error: Unable to stat directory <%s> for scha_control timestamp file

Description:

The rgmd failed in a call to stat(2) on the local node. This may prevent the anti-"pingpong" feature from working, which may permit a resource group to fail over repeatedly between two or more nodes. The failure of the stat call might indicate a more serious problem on the node.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.


358129 Either extension property <failover_enabled> is not defined, or an error occured while retrieving this property; using the default value of TRUE.

Description:

Property failover_enabled may not be defined in RTR file. Continue the process with the default value of TRUE.

Solution:

This is an informational message, no user action is needed.


358211 monitor_check: the failover requested by scha_control for resource <%s>, resource group <%s> was not completed because of error: %s

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed, due to the error listed.

Solution:

Examine other syslog messages on all cluster members that occurred about the same time as this message, to see if the problem that caused the error can be identified and repaired.


358533 Invalid protocol is specified in %s property.

Description:

The specified system property does not have a valid protocol.

Solution:

Using scrgadm(1M), change the value of the property to use a valid protocol. For example: TCP, UDP.


359648 Service has failed.

Description:

The probe has detected a failure in the data service. The data service cannot be restarted on the same node because of frequent failures. The probe is setting the resource status to failed.

Solution:

Wait for the fault monitor to fail over the data service. Check the syslog messages and configuration of the data service.


360600 Oracle UDLM package wrong instruction set architecture.

Description:

The Oracle UDLM package that is currently installed has the wrong instruction set architecture for the mode in which the node is currently booted (for example, Oracle UDLM is 64-bit (sparc9) and the node is currently booted in 32-bit mode (sparc)).

Solution:

Obtain and install the proper Oracle UDLM package from Oracle for the instruction set architecture of the system, or boot the node in an instruction set architecture that is compatible with the current version of the Oracle UDLM.


361048 ERROR: rgm_run_state() returned non-zero while running boot methods

Description:

The rgmd state machine has encountered an error on this node.

Solution:

Look for preceding syslog messages on the same node, which may provide a description of the error.


361289 iPlanet service with config file <%s> does not configure %s.

Description:

The magnus.conf configuration file for the iPlanet Web Server instance does not contain the specified directive.

Solution:

Edit the configuration file and set the specified directive.


361489 in libsecurity __rpc_negotiate_uid failed for transport %s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish a rpc connection for the network. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


361831 Initialization failed. Invalid command line %s %s.

Description:

Unable to process parameters passed to the callback method. The parameters are indicated in the message. This is a Sun Cluster HA for Sybase internal error.

Solution:

Report this problem to your authorized Sun service provider.


362463 clcomm: Endpoint %p: path_down not allowed in state %d

Description:

The system maintains information about the state of an Endpoint. The path_down operation is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


362519 dl_attach: DL_ERROR_ACK bad PPA

Description:

Could not attach to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


362657 Error when sending response from child process: %m

Description:

Error occurred when sending message from fault monitor child process. Child process will be stopped and restarted.

Solution:

If the error persists, disable the fault monitor and report the problem.


363357 Failed to unregister callback for IPMP group %s with tag %s (request failed with %d).

Description:

An unexpected error occurred while trying to communicate with the network monitoring daemon (pnmd).

Solution:

Make sure the network monitoring daemon (pnmd) is running.


363505 check_for_ccrdata failed strdup for (%s)

Description:

Call to strdup failed. The "strdup" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.


363972 reservation message(%s) - Waiting for reservation lock

Description:

Locking is used by the device fencing program to ensure correct behavior when different nodes see different cluster memberships. This node is waiting for an instance of the device fencing program on another node to complete.

Solution:

The lock should eventually be granted. If node failures are involved, the lock will not be granted until node deaths are assured, which may take a few minutes. If the lock is eventually granted, no user action is required. If the lock is not granted, your authorized Sun service provider should be contacted to help diagnose the problem.


364188 Validation failed. Listener_name not set

Description:

'Listener_name' property of the resource is not set. HA-Oracle will not be able to manage Oracle listener if Listener name is not set.

Solution:

Specify correct 'Listener_name' when creating resource. If resource is already created, please update resource property.


364510 The specified Oracle dba group id (%s) does not exist

Description:

Group id of oracle dba does not exist.

Solution:

Make sure the /etc/nsswitch.conf and /etc/group files are valid and have correct information to get the group id of dba.
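
To check whether a group resolves on a given node, a lookup through the system's group database can be scripted. The Python sketch below uses the standard grp module, which is available on Unix systems and queries the configured group database; the function name is hypothetical.

```python
import grp


def group_exists(name):
    """Return True if the named group can be resolved on this node."""
    try:
        grp.getgrnam(name)
        return True
    except KeyError:
        # Raised when no matching group entry is found.
        return False


if __name__ == "__main__":
    print(group_exists("dba"))
```

Run the check on every node that can host the resource, since the group must resolve wherever the resource is brought online.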


366225 Listener %s stopped successfully

Description:

Informational message. HA-Oracle successfully stopped Oracle listener.

Solution:

None


366769 The Hosts in the startup order are not up. Waiting for them to start....

Description:

The probe has detected that the BV processes are not running but cannot take any action because the BV hosts in the startup order are not running.

Solution:

If the resource groups that contain the Backend resources are not online, bring them online. If they are online, the BV processes are probably still coming up, and no action is needed.


367270 INTERNAL ERROR: Failed to create the path to the %s file.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


367617 reservation fatal error(%s) - Invalid file format '%s'

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


367864 svc_init failed.

Description:

The rpc.pmfd server was not able to initialize server operation. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


368363 Failed to retrieve the current primary node.

Description:

Cannot retrieve the current primary node for the given resource group.

Solution:

Check the syslog messages that occurred just before this message to see whether an internal error has occurred. If so, contact your authorized Sun service provider. Otherwise, check if the resource group is in STOP_FAILED state. If it is, clear the state and bring the resource group online.


368819 t_rcvudata in recv_request: %s

Description:

Call to t_rcvudata() failed. The "t_rcvudata" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


369460 udlm_send_reply %s: udp is null!

Description:

Can not communicate with udlmctl because the address to send to is null.

Solution:

None. udlm will handle this error.


369570 Multiple ports specified in property %s.

Description:

Solution:


370604 This resource depends on a HAStoragePlus resouce that is in a different Resource Group. This configuration is not supported.

Description:

The resource depends on a HAStoragePlus resource that is configured in a different resource group. This configuration is not supported.

Solution:

Please add this resource and the HAStoragePlus resource in the same resource group.


370949 created %d threads to launch resource callback methods; desired number = %d

Description:

The rgmd was unable to create the desired number of threads upon starting up. This is not a fatal error, but it may cause RGM reconfigurations to take longer because it will limit the number of tasks that the rgmd can perform concurrently.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Examine other syslog messages on the same node to see if the cause of the problem can be determined.


371297 %s: Invalid command line option. Use -S for secure mode

Description:

rpc.sccheckd should always be invoked in secure mode. If this message shows up, someone has modified configuration files that affect server startup.

Solution:

Reinstall cluster packages or contact your service provider.


371369 CCR: CCR data server on node %s unreachable while updating table %s.

Description:

While the TM was updating the indicated table in the cluster, the specified node went down and has become unreachable.

Solution:

The specified node needs to be rebooted.


372880 CMM: Quorum device %ld (gdevname %s) can not be acquired by the current cluster members. This quorum device is held by node%s %s.

Description:

This node does not have its reservation key on the specified quorum device, which has been reserved by the specified node or nodes that the local node cannot communicate with. This indicates that in the last incarnation of the cluster, the other nodes were members whereas the local node was not, indicating that the CCR on the local node may be out-of-date. In order to ensure that this node has the latest cluster configuration information, it must be able to communicate with at least one other node that was a member of the previous cluster incarnation. The nodes holding the specified quorum device may be down, or they may be up but the interconnect between them and this node may be broken.

Solution:

If the nodes holding the specified quorum devices are up, then fix the interconnect between them and this node so that communication between them is restored. If the nodes are indeed down, boot one of them.


372887 HA: repl_mgr: exception occurred while invoking RMA

Description:

An unrecoverable failure occurred in the HA framework.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


373148 The port portion of %s at position %d in property %s is not a valid port.

Description:

The property named does not have a legal value. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Assign the property a legal value.
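
The zero-based position reported in this message can be reproduced with a short check. The following is a minimal sketch, not the validation code used by the cluster software: it assumes that each list element has the form port/protocol (for example 80/tcp) and that a legal port is a decimal number from 1 to 65535. The function name find_invalid_ports is hypothetical.

```python
def find_invalid_ports(entries):
    """Return (position, entry) pairs whose port portion is not a valid port.

    Assumption: each entry looks like "port/protocol" and a valid port is a
    decimal integer in the range 1..65535. Positions start at 0, matching
    the position index described in the message above.
    """
    invalid = []
    for position, entry in enumerate(entries):
        port_text = entry.split("/", 1)[0]
        is_number = port_text.isascii() and port_text.isdigit()
        if not (is_number and 1 <= int(port_text) <= 65535):
            invalid.append((position, entry))
    return invalid


# Hypothetical property value with two bad elements.
port_list = ["80/tcp", "http/tcp", "70000/tcp"]
for position, entry in find_invalid_ports(port_list):
    print(f"element {entry!r} at position {position} has an invalid port")
```

Here the second and third elements are reported at positions 1 and 2, because the first element in the list is position 0.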


373816 clcomm: copyinstr: max string length %d too long

Description:

The system attempted to copy a string from user space to the kernel. The maximum string length exceeds the length limit.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


374006 prog <%s> failed on step <%s> retcode <%d>

Description:

A ucmmd step failed.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


374738 dl_bind: DL_BIND_ACK bad sap %u

Description:

SAP in acknowledgment to bind request is different from the SAP in the request. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


376111 Unable to compose %s path. Sending SIGKILL now.

Description:

The STOP method was not able to construct the application's stop command. The STOP method will send SIGKILL to stop the application.

Solution:

Other messages will indicate what the underlying problem was such as no memory or a bad configuration.


376905 Failed to retrieve WLS extension properties.Will shutdown the WLS using sigkill

Description:

Failed to retrieve the WLS extension properties that are needed to do a smooth shutdown. The WLS stop method, however, will go ahead and kill the WLS process.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


376974 Error in initialization; exiting.

Description:

Solution:


377210 Failed to retrieve BV extension properties.

Description:

Failed to retrieve the extension properties set by the user, or failed to retrieve a valid host for BV processes.

Solution:

Look for other error messages generated while retrieving the extension properties to identify the exact error. Look for the appropriate action for that error message.


377347 CMM: Node %s (nodeid = %ld) is up; new incarnation number = %ld.

Description:

The specified node has come up and joined the cluster. A node is assigned a unique incarnation number each time it boots up.

Solution:

This is an informational message, no user action is needed.


377531 Stop saposcol under PMF times out.

Description:

Stopping the SAP OS collector process under the control of Process Monitor facility times out. This might happen under heavy system load.

Solution:

Consider increasing the stop timeout value.


378427 prog <%s> step <%s> terminated due to receipt of signal

Description:

ucmmd step terminated due to receipt of a signal.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


377897 Successfully started the service

Description:

Informational message. SAP started up successfully.

Solution:

No action needed.


378220 Siebel gateway already running.

Description:

Siebel gateway was not expected to be running. This may be due to the gateway having started outside Sun Cluster control.

Solution:

Please shutdown the gateway instance manually, and retry the previous operation.


378807 clexecd: %s: sigfillset returned %d. Exiting.

Description:

clexecd program has encountered a failed sigfillset(3C) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


378872 %s operation failed: %s.

Description:

Specified system operation could not complete successfully.

Solution:

This is an internal error. Contact your authorized Sun service provider with the following information. 1) Saved copy of /var/adm/messages file. 2) Output of "ifconfig -a" command.


379450 reservation fatal error(%s) - fenced_node not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


379574 Unable to lock %s: %s.


380064 switchback attempt failed on resource group <%s> with error <%s>

Description:

The rgmd was unable to failback the specified resource group to a more preferred node. The additional error information in the message explains why.

Solution:

Examine other syslog messages occurring around the same time on the same node. These messages may indicate further action.


380317 Failed to verify that all IPMP groups are in a stable state. Assuming this node cannot respond to client requests.

Description:

The state of the IPMP groups on the node could not be determined.

Solution:

Make sure all adapters and cables are working. Look in the /var/adm/messages file for messages from the network monitoring daemon (pnmd).


380365 (%s) t_rcvudata, res %d, flag %d: tli error: %s

Description:

Call to t_rcvudata() failed. The "t_rcvudata" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


380897 rebalance: WARNING: resource group <%s> is <%s> on node <%d>, resetting to OFFLINE.

Description:

The resource group has been found to be in the indicated state and is being reset to OFFLINE. This message is a warning only and should not adversely affect the operation of the RGM.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


381244 in libsecurity mkdir of %s failed: %s

Description:

The rpc.pmfd, rpc.fed or rgmd server was not able to create a directory to contain "cache" files for rpcbind information. The affected component should still be able to function by directly calling rpcbind.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


381386 Prog <%s> step <%s>: unkillable.

Description:

The specified callback method for the specified resource became stuck in the kernel, and could not be killed with a SIGKILL. The UCMM reboots the node to prevent the stuck node from causing unavailability of the services provided by UCMM.

Solution:

No action is required. This is normal behavior of the UCMM. Other syslog messages that occurred just before this one might indicate the cause of the method failure.


381765 sema_post: %s

Description:

The rpc.pmfd server was not able to act on a semaphore. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


382169 Share path name %s not absolute.

Description:

A path specified in the dfstab file does not begin with "/".

Solution:

Only absolute path names can be shared with HA-NFS.
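
The check behind this message can be approximated before a resource is started. The following is a minimal sketch under stated assumptions, not the HA-NFS validation code: it assumes that each non-comment dfstab line is a share command whose last argument is the path to share, and the function name non_absolute_share_paths is hypothetical.

```python
import shlex


def non_absolute_share_paths(dfstab_text):
    """Return the share paths in dfstab_text that do not begin with "/".

    Assumption: every non-comment line is a "share ..." command and the
    final argument on the line is the path being shared.
    """
    bad_paths = []
    for line in dfstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        tokens = shlex.split(line)
        if len(tokens) < 2 or tokens[0] != "share":
            continue  # not a share command with arguments
        path = tokens[-1]
        if not path.startswith("/"):
            bad_paths.append(path)
    return bad_paths


# Hypothetical dfstab content; the second share line is not absolute.
sample = """# HA-NFS shares
share -F nfs -o rw -d "home dirs" /global/nfs/home
share -F nfs -o ro export/data
"""
print(non_absolute_share_paths(sample))
```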


382252 Share path %s: file system %s is not mounted.

Description:

The specified file system, which contains the share path specified, is not currently mounted.

Solution:

Correct the situation with the file system so that it gets mounted.


382295 Unable to register.


382995 ioctl in negotiate_uid failed

Description:

Call to ioctl() failed. The "ioctl" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


383706 NULL value returned for the resource property %s.

Description:

NULL value was specified for the extension property of the resource.

Solution:

For the property name check the syslog message. Any of the following situations might have occurred. Different user action is needed for these different scenarios. 1) If a new resource is created or updated, check whether the value of the extension property is empty. If it is, provide valid value using scrgadm(1M). 2) For all other cases, treat it as an Internal error. Contact your authorized Sun service provider.


384549 CCR: Could not backup the CCR table %s errno = %d.

Description:

The indicated error occurred while backing up the indicated CCR table on this node. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28 (ENOSPC) indicates that the root file system on the indicated node is full. Other values of errno can be returned when the root disk has failed (EIO) or some of the CCR tables have been deleted outside the control of the cluster software (ENOENT).

Solution:

There may be other related messages on this node, which may help diagnose the problem. For example: If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced. If the indicated CCR table was accidentally deleted, then boot this node in -x mode to restore the indicated CCR table from other nodes in the cluster or backup. The CCR tables are located at /etc/cluster/ccr/.
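
When reading this message in a log, it can help to translate the numeric errno into its symbolic name. The following is a minimal sketch that uses the standard Python errno module; the hint strings are paraphrased from the description above, and the function name explain_errno is hypothetical. Numeric errno values are platform-specific, so the lookup should be run on (or for) the platform that logged the message.

```python
import errno

# Hints paraphrased from the description of message 384549.
CCR_BACKUP_HINTS = {
    errno.ENOSPC: "root file system is full",
    errno.EIO: "root disk may have failed",
    errno.ENOENT: "CCR table may have been deleted outside cluster control",
}


def explain_errno(value):
    """Return a one-line explanation for an errno value from the message."""
    name = errno.errorcode.get(value, "UNKNOWN")
    hint = CCR_BACKUP_HINTS.get(value, "see other syslog messages on this node")
    return f"errno {value} ({name}): {hint}"


# 28 is the value cited in the description for ENOSPC.
print(explain_errno(28))
```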


384621 RDBMS probe successful

Description:

This message indicates that the fault monitor has successfully probed the RDBMS server.

Solution:

No action required. This is an informational message.


384820 libsecurity: rpc_createerror: %s

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection. The error message generated with a call to clnt_spcreateerror(3NSL) is appended.

Solution:

Save the /var/adm/messages file. Check the messages file for earlier errors related to the rpc.pmfd, rpc.fed, or rgmd server.


385407 t_alloc (open_cmd_port) failed with errno%d

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


385550 Can't setup binding entries from node %d for GIF node %d

Description:

Failed to maintain client affinity for some sticky services running on the named server node due to a problem on the named GIF node. Connections from existing clients for those services might go to a different server node as a result.

Solution:

If client affinity is a requirement for some of the sticky services, say due to data integrity reasons, switchover all global interfaces (GIFs) from the named GIF node to some other node.


385902 pmf_search_children: Error signaling <%s>: %s

Description:

An error occurred while rpc.pmfd attempted to send a signal to one of the processes of the given tag. The reason for the failure is also given. The signal was sent to the process as a result of some event external to rpc.pmfd. rpc.pmfd "intercepted" the signal, and is trying to pass the signal on to the monitored process.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


386024 ERROR: rebalance: duplicate nodeid <%d> in Nodelist of resource group <%s>; continuing

Description:

The same nodename appears twice in the Nodelist of the given resource group. Although non-fatal, this should not occur and may indicate an internal logic error in the rgmd.

Solution:

Use scrgadm -pv to check the Nodelist of the affected resource group. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


386072 chdir: %s

Description:

The rpc.pmfd server was not able to change directory. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


386908 Resource is already stopped.

Description:

An attempt was made to stop a resource that has already been stopped.

Solution:

Use the ps command to make sure that all processes for the Data Service have been stopped. Check syslog for any possible errors which may have occurred just before this message. If everything appears to be correct, then no action is required.


386995 Failed to parse xml: NULL attribute

Description:

Solution:


387003 CCR: CCR metadata not found.

Description:

The CCR is unable to locate its metadata.

Solution:

Boot the offending node in -x mode to restore the indicated table from backup or other nodes in the cluster. The CCR tables are located at /etc/cluster/ccr/.


387150 scvxvmlg warning - found no matching volume for device node %s, removing it

Description:

The program responsible for maintaining the VxVM device namespace has discovered inconsistencies between the VxVM device namespace on this node and the VxVM configuration information stored in the cluster device configuration system. If configuration changes were made recently, then this message should reflect one of the configuration changes. If no changes were made recently or if this message does not correctly reflect a change that has been made, the VxVM device namespace on this node may be in an inconsistent state. VxVM volumes may be inaccessible from this node.

Solution:

If this message correctly reflects a configuration change to VxVM diskgroups then no action is required. If the change this message reflects is not correct, then the information stored in the device configuration system for each VxVM diskgroup should be examined for correctness. If the information in the device configuration system is accurate, then executing '/usr/cluster/lib/dcs/scvxvmlg' on this node should restore the device namespace. If the information stored in the device configuration system is not accurate, it must be updated by executing '/usr/cluster/bin/scconf -c -D name=diskgroup_name' for each VxVM diskgroup with inconsistent information.


387232 resource %s monitor enabled.

Description:

This is a notification from the rgmd that the operator has enabled monitoring on a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


387288 clcomm: Path %s online

Description:

A communication link has been established with another node.

Solution:

No action required.


387572 File %s should be readable and writable by %s.

Description:

A program required the specified file to be readable and writable by the specified user.

Solution:

Set correct permissions for the specified file to allow the specified user to read it and write to it.


388330 Text server stopped.

Description:

The Text server has been stopped by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


389221 could not open configuration file: %s

Description:

The specified configuration file could not be opened.

Solution:

Check if the configuration file exists and has correct permissions. If the problem persists, contact your Sun Service representative.


389231 clcomm: inbound_invo::cancel:_state is 0x%x

Description:

The internal state describing the server side of a remote invocation is invalid when a cancel message arrives during processing of the remote invocation.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


389369 Validation failed. SYBASE ASE STOP_FILE %s is not executable.

Description:

File specified in the STOP_FILE extension property is not an executable file.

Solution:

Please check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.
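
The permission check can be done by hand with ls -l, or scripted. The following is a minimal sketch, not the validation performed by Sun Cluster HA for Sybase: it assumes that the file is owned by the Sybase owner and only tests whether the path is a regular file with the owner execute bit set. The function name owner_can_execute and the example path are hypothetical.

```python
import os
import stat


def owner_can_execute(path):
    """Return True if path is a regular file with the owner execute bit set.

    Assumption: the file is owned by the Sybase owner, so the owner
    execute bit is the one that matters for that user.
    """
    mode = os.stat(path).st_mode
    return stat.S_ISREG(mode) and bool(mode & stat.S_IXUSR)


# Hypothetical STOP_FILE location; substitute the value of the property.
stop_file = "/path/to/stop_file"
if os.path.exists(stop_file) and not owner_can_execute(stop_file):
    print(f"{stop_file} is not executable by its owner")
```

If the check fails, adding execute permission for the owner (for example with chmod u+x) is the usual correction.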


389516 NULL value returned for the extension property %s.

Description:

NULL value was specified for the extension property of the resource.

Solution:

For the property name check the syslog message. Any of the following situations might have occurred. Different user action is needed for these different scenarios. 1) If a new resource is created or updated, check whether the value of the extension property is empty. If it is, provide valid value using scrgadm(1M). 2) For all other cases, treat it as an Internal error. Contact your authorized Sun service provider.


389901 ext_props(): Out of memory

Description:

System runs out of memory in function ext_props().

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.


390130 Failed to allocate space for %s.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


390691 NFS daemon down

Description:

HA-NFS fault monitor detected that an nfs daemon died and will automatically restart it later.

Solution:

No action required.


391177 Failed to open %s: %s.


391738 (%s) bad poll revent: %x (hex)

Description:

Call to poll() failed. The "poll" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


392782 Failed to retrieve the property %s for %s: %s.

Description:

API operation has failed in retrieving the cluster property.

Solution:

For property name, check the syslog message. For more details about API call failure, check the syslog messages from other components.


393385 Service daemon not running.

Description:

Process group has died and the data service's daemon is not running. Updating the resource status.

Solution:

Wait for the fault monitor to restart or failover the data service. Check the configuration of the data service.


393934 Stopping the adaptive server with wait option.

Description:

The Sun Cluster HA for Sybase will attempt to shutdown the Sybase adaptive server using the wait option.

Solution:

This is an informational message, no user action is needed.


393960 sigaction failed in set_signal_handler

Description:

The ucmmd has failed to initialize signal handlers by a call to sigaction(2). The error message indicates the reason for the failure. The ucmmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes and of the ucmmd core. Contact your authorized Sun service provider for assistance in diagnosing the problem.


394325 Received notice that IPMP group %s has failed.

Description:

The status of the named IPMP group has become degraded. If possible, the scalable resources currently running on this node with monitoring enabled will be relocated off of this node, if the IPMP group stays in a degraded state.

Solution:

Check the status of the IPMP group on the node. Try to fix the adapters in the IPMP group.


395353 Failed to check whether the resource is a network address resource.

Description:

While retrieving the IP addresses from the network resources, the attempt to check whether the resource is a network resource or not has failed.

Solution:

Internal error or API call failure might be the reasons. Check the error messages that occurred just before this message. If there is internal error, contact your authorized Sun service provider. For API call failure, check the syslog messages from other components.


396134 Register callback with NAFO %s failed: Error %d.

Description:

LogicalHostname resource was unable to register with IPMP for status updates.

Solution:

This is most likely the result of a lack of system resources. Check for memory availability on the node. Reboot the node if the problem persists.


396727 Attempting to check for existence of %s pid %d resulted in error: %s.

Description:

HA-NFS fault monitor attempted to check the status of the specified process but failed. The specific cause of the error is logged.

Solution:

No action. HA-NFS fault monitor would ignore this error and would attempt this operation at a later time. If this error persists, check to see if the system is lacking the required resources (memory and swap) and add or free resources if required. Reboot the node if the error persists.


397020 unix DLM abort failed

Description:

Failed to abort unix dlm. This is an error that can be ignored.

Solution:

None.


397219 RGM: Could not allocate %d bytes; node is out of swap space; aborting node.

Description:

The rgmd failed to allocate memory, most likely because the system has run out of swap space. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

The problem is probably cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.


397340 Monitor initialization error. Unable to open resource: %s Group: %s: error %d

Description:

An error occurred in monitor initialization. The monitor is unable to get resource information using API calls.

Solution:

Check syslog messages for errors logged from other system modules. Stop and start fault monitor. If error persists then disable fault monitor and report the problem.


397371 'dbmcli -d <LC_NAME> -n <logical host name> db_state' timed out.

Description:

The SAP utility listed timed out.

Solution:

Make sure the logical host name resource is online.


398345 Error %d setting policy %d %s

Description:

This message appears when the customer is initializing or changing a scalable services load balancer, by starting or updating a service. An internal error happened while trying to change the load balancing policy.

Solution:

This is an internal error that can occur if another RGM operation was in progress at the same time. Try the operation again. If the error occurs when no other RGM update is in progress, contact your Sun service provider for help.


398878 reservation fatal error(%s) - dcs_get_service_parameters() error, returned %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


398973 The probe has requested an immediate failover. Attempting to failover this resource group subject to the setting of the Failover_enabled property.

Description:

An immediate failover will be performed.

Solution:

This is an informational message, no user action is needed.


399037 mc_closeconn failed to close connection

Description:

The system has run out of resources that are required to process connection terminations for a scalable service.

Solution:

If cluster networking is required, add more resources (most probably, memory) and reboot.


399216 clexecd: Got an unexpected signal %d in process %s (pid=%d, ppid=%d)

Description:

clexecd program got a signal indicated in the error message.

Solution:

clexecd program will exit and node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


399266 Cluster goes into pingpong booting because of failure of method <%s> on resource <%s>. RGM is not aborting this node.

Description:

A stop method failed and Failover_mode is set to HARD, but the RGM has detected this resource group falling into pingpong behavior and will not abort the node on which the resource's stop method failed. This is most likely due to the failure of both the resource's start and stop methods.

Solution:

Save a copy of /var/adm/messages, check for both failed start and stop methods of the failing resource, and make sure that the failure is corrected. Refer to the procedure for clearing the ERROR_STOP_FAILED condition on a resource group in the Sun Cluster Administration Guide to restart the resource group.


399753 CCR: CCR data server failed to register with CCR transaction manager.

Description:

The CCR data server on this node failed to join the cluster, and can only serve readonly requests.

Solution:

There may be other related CCR messages on this and other nodes in the cluster, which may help diagnose the problem. It may be necessary to reboot this node or the entire cluster.

Message IDs 400000–499999


400592 UNIX DLM is asking for a reconfiguration to recover from a communication error. This message is acceptable during a reconfiguration already in progress.

Description:

The cluster will reconfigure.

Solution:

None.


400855 Processes on $HOSTNAME are stopped.

Description:

This is just an informational message that the BV processes on the specified host have stopped.

Solution:

No action required.


401115 t_rcvudata (recv_request) failed

Description:

Call to t_rcvudata() failed. The "t_rcvudata" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


401252 Validation failed. Resource property FAILOVER_MODE must be NONE

Description:

The resource being created or modified must have a value of NONE for its FAILOVER_MODE property.

Solution:

Specify NONE for the FAILOVER_MODE property.


401400 Successfully stopped the application

Description:

The STOP method successfully stopped the resource.

Solution:

This message is informational; no user action is needed.


401573 INTERNAL ERROR: START method not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


402289 t_bind: %s

Description:

Call to t_bind() failed. The "t_bind" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


402484 NULL command string passed.

Description:

A NULL value was specified for the command argument.

Solution:

Specify a non-NULL value for the command string.


402992 Failfast: Destroying failfast unit %s while armed.

Description:

The specified failfast unit was destroyed while it was still armed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


403257 Failed to start Backup server.

Description:

Sun Cluster HA for Sybase failed to start the backup server. Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Please check whether the server can be started manually. Examine the HA-Sybase log files, backup server log files and setup.


404309 in libsecurity cred flavor is not AUTH_SYS

Description:

A server (rpc.pmfd, rpc.fed or rgmd) refused an rpc connection from a client because the authorization is not of UNIX type. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


404866 method_full_name: malloc failed

Description:

The rgmd server was not able to create the full name of the method, while trying to connect to the rpc.fed server, probably due to low memory. An error message is output to syslog.

Solution:

Investigate whether the host is running out of memory. If it is not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


405030 Hosts in the startup order are not up. The Probe will start the processes on %s

Description:

The resource group containing the specified host will be online, but the BV processes will not be started because the hosts in the startup order (backend hosts) are not up. The Probe will wait for these hosts to start up before starting the processes on the specified host.

Solution:

If the Resource Groups which contain the Backend resources are not online, then bring them online. If they are online, then the BV processes are probably in the process of coming up, so there is no need to take any action; the probe will take the appropriate action.


405201 Validation failed. Resource group property NODELIST must contain only 1 node

Description:

The resource being created or modified must belong to a group that can have only one node name in its NODELIST property.

Solution:

Specify just one node in the NODELIST property.


405508 clcomm: Adapter %s has been deleted

Description:

A network adapter has been removed.

Solution:

No action required.


405552 Unable to contact fault monitor, restarting service.

Description:

The process monitoring facility tried to send a message to the fault monitor noting that the data service application died. It was unable to do so.

Solution:

Because some part (daemon) of the application has failed, the service will be restarted. If the fault monitor has not yet started, wait for the Sun Cluster framework to start it. If the fault monitor has been disabled, enable it using scswitch.


405989 %s can't plumb %s

Description:

This means that the Logical IP address could not be plumbed on an adapter belonging to the named IPMP group.

Solution:

There could be other related error messages which might be helpful. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


406610 st_ff_arm failed: %s

Description:

The rpc.pmfd server was not able to initialize the failfast mechanism. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog. The message contains the system error.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


406612 Failed to unplumb <%s> from <%s>.


406635 fatal: joiners_run_boot_methods: exiting early because of unexpected exception

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


407784 socket: %s

Description:

Solution:


408164 Invalid value for property %s.

Description:

Solution:


408214 Failed to create scalable service group %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


408282 clcomm: RT or TS classes not configured

Description:

The system requires either real time or time sharing thread scheduling classes for use in user processes. Neither class is available.

Solution:

Configure Solaris to support either real time or time sharing or both thread scheduling classes for user processes.


408672 Removing file %s.

Description:

HA-NetBackup removes NetBackup startup and shutdown scripts from /etc/rc2.d and /etc/rc0.d to prevent automatic startup and shutdown of NetBackup.

Solution:

None. This is only an informational message.


408742 svc_setschedprio: Could not save current scheduling parameters: %s

Description:

The server was not able to save the original scheduling mode. The system error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


409267 Error opening procfs control file (for parent process) <%s> for tag <%s>: %

Description:

The rpc.pmfd server was not able to open the procfs control file for the parent process, and the system error is shown. procfs control files are required in order to monitor user processes.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


409443 fatal: unexpected exception in rgm_init_pres_state

Description:

This node encountered an unexpected error while communicating with other cluster nodes during a cluster reconfiguration. The rgmd will produce a core file and will cause the node to halt or reboot.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


409693 Aborting startup: failover of NFS resource groups may be in progress.

Description:

Startup of an NFS resource was aborted because a failure was detected by another resource group, which would be in the process of failover.

Solution:

Attempt to start the NFS resource after the failover is completed. It may be necessary to start the resource on another node if the current node is not healthy.


410176 Failed to register callback for IPMP group %s with tag %s and callback command %s (request failed with %d).

Description:

An unexpected error occurred while trying to communicate with the network monitoring daemon (pnmd).

Solution:

Make sure the network monitoring daemon (pnmd) is running.


410860 lkcm_act: cm_reconfigure failed: %s

Description:

ucmm reconfiguration failed. This could also point to a problem with the interconnect components.

Solution:

None if the next reconfiguration succeeds. If not, save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


411227 Failed to stop the process with: %s. Retry with SIGKILL.

Description:

The process monitor facility failed to stop the data service. It is reattempting to stop the data service.

Solution:

This is an informational message. Check the Stop_timeout property and adjust it if its value is not appropriate.
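The retry described by this message can be pictured with a minimal Python sketch. It is not the process monitor facility's implementation: the function name `stop_with_escalation` and the `stop_timeout` parameter are hypothetical stand-ins for a data service's stop logic and its Stop_timeout property.

```python
import signal
import subprocess
import sys


def stop_with_escalation(proc, stop_timeout):
    """Send SIGTERM, wait up to stop_timeout seconds, then fall back to SIGKILL.

    Returns the name of the last signal that was sent.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=stop_timeout)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        # The first stop attempt did not finish in time: retry with SIGKILL.
        proc.kill()
        proc.wait()
        return "SIGKILL"


if __name__ == "__main__":
    # Hypothetical stand-in for a data service process.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_with_escalation(child, stop_timeout=5.0))
```

If the fallback to SIGKILL happens regularly, that suggests the timeout is too short for the application to shut down cleanly.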


411369 Not found clexecd on node %d for %d seconds. Giving up!

Description:

Could not find clexecd to execute the program on the indicated node. The system gave up after retrying for the indicated number of seconds.

Solution:

This is an informational message, no user action is needed.


412106 Internal Error. Unable to get fault monitor name

Description:

This is an internal error. Could not determine fault monitor program name.

Solution:

Please report this problem.


412366 setsid failed: %s

Description:

Failed to run the "setsid" command. The "setsid" man page describes possible error codes.

Solution:

None. ucmmd will exit.


412533 clcomm: validate_policy: invalid relationship moderate %d low %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. The moderate server thread level cannot be less than the low server thread level.

Solution:

No user action required.


412558 inet addr %s length %d = %s

Description:

Information about hosts.

Solution:

None.


413513 INTERNAL ERROR Failfast: ff_impl_shouldnt_happen.

Description:

An internal error has occurred in the failfast software.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


413569 CCR: Invalid CCR table : %s.

Description:

CCR could not find a valid version of the indicated table on the nodes in the cluster.

Solution:

There may be other related messages on the nodes where the failure occurred. They may help diagnose the problem. If the indicated table is unreadable due to disk failure, the root disk on that node needs to be replaced. If the table file is corrupted or missing, boot the cluster in -x mode to restore the indicated table from backup. The CCR tables are located at /etc/cluster/ccr/.


414680 fatal: register_president: Don't have reference to myself

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


415842 fatal: scswitch_onoff: invalid opcode <%d>

Description:

While attempting to execute an operator-requested enable or disable of a resource, the rgmd has encountered an internal error. This error should not occur. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


416904 Orbixd Probe failed

Description:

Just an informational message that the orbix daemon probe failed.

Solution:

No action needed. The probe will take appropriate action.


416483 Failed to retrieve the resource information.

Description:

A Sun Cluster data service is unable to retrieve the resource information. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components.


417144 Must be root to start %s

Description:

The program or daemon has been started by someone not in superuser mode.

Solution:

Log in as root and run the program. If it is a daemon, it may be incorrectly installed. Reinstall cluster packages or contact your service provider.


417629 Database or gateway down.

Description:

This indicates that the Siebel database or Siebel gateway is unavailable for the Siebel server.

Solution:

Please determine the reason for Siebel database or Siebel gateway failure, and ensure that they are both running. If the Siebel server resource is not offline, it should get started by the fault monitor.


417903 clexecd: waitpid returned %d.

Description:

clexecd program has encountered a failed waitpid(2) system call. The error message indicates the error number for the failure.

Solution:

clexecd program will exit and node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


419220 %s restore operation failed.

Description:

In the process of creating a shared address resource the system was attempting to reconfigure the ip addresses on the system. The specified operation failed.

Solution:

Use the ifconfig command to make sure that all the ip addresses are present. If not, remove the shared address resource and run the scrgadm command to recreate it. If the problem persists, reboot.


419291 Unable to connect to Siebel gateway.

Description:

Siebel gateway may be unreachable.

Solution:

Please verify that the Siebel gateway resource is up.


419301 The probe command <%s> timed out

Description:

A timeout occurred when executing the probe command provided by the user under the hatimerun(1M) utility.

Solution:

This problem may occur when the cluster is under load. You may consider increasing the Probe_timeout property.
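The effect of a probe timeout can be illustrated with a short Python sketch. It only mimics the idea of running a command under a time limit; it is not hatimerun(1M), and the function name `run_probe` and the `probe_timeout` parameter are hypothetical.

```python
import subprocess
import sys


def run_probe(command, probe_timeout):
    """Run a probe command under a time limit.

    Returns the command's exit status, or None if it timed out.
    subprocess.run() kills the child process when the timeout expires.
    """
    try:
        return subprocess.run(command, timeout=probe_timeout).returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Hypothetical probe that takes longer than the allowed time.
    slow_probe = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_probe(slow_probe, probe_timeout=0.5))
```

A probe that completes normally on an idle system but returns None under load is the situation in which a larger timeout value is worth considering.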


419529 INTERNAL ERROR CMM: Failure registering callbacks.

Description:

An instance of the userland CMM encountered an internal initialization error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


419972 clcomm: Adapter %s is faulted

Description:

A network adapter has encountered a fault.

Solution:

Any interconnect failure should be resolved, and/or a failed node rebooted.


420591 BV Config Error:IMs configured on both physical and private interconnect.

Description:

The Interaction Managers are configured on both the physical host and the private host. This is not supported. The Interaction Managers should be configured on only one, either the physical node or the cluster private node.

Solution:

Reconfigure the Broadvision servers with IMs only on either the physical node or the cluster private IP. Refer to the HA-BV installation and configuration guide.


420763 Switchover (%s) error (%d) after failure to become secondary

Description:

The file system specified in the message could not be hosted on the node the message came from.

Solution:

Check /var/adm/messages to make sure there were no device errors. If not, contact your authorized Sun service provider to determine whether a workaround or patch is available.


422190 Failed to reboot node: %s.

Description:

HA-NFS fault monitor was attempting to reboot the node, because rpcbind daemon was unresponsive. However, the attempt to reboot the node itself did not succeed.

Solution:

Fault monitor would exit once it encounters this error. However, process monitoring facility would restart it (if enough resources are available on the system). If rpcbind remains unresponsive, the fault monitor (restarted by PMF) would again attempt to reboot the node. If this problem persists, reboot the node. Also see message id 804791.


422214 CMM: Votecount changed from %d to %d for quorum device %ld (%s).

Description:

The votecount for the specified quorum device has been changed as indicated.

Solution:

This is an informational message, no user action is needed.


422541 Failed to register with PDTserver

Description:

This means that we have lost communication with PDT server. Scalable services will not work any more. Probably, the nodes which are configured to be the primaries and secondaries for the PDT server are down.

Solution:

Restart any of the nodes that are configured to be the primary or secondary for the PDT server.


423538 WARNING: UDLM_PROCEED was picked up by a lkcm_act, returning LKCM_NOOP

Description:

An internal warning during udlm state update.

Solution:

None.


423958 resource group %s state change to unmanaged.

Description:

This is a notification from the rgmd that a resource group's state has changed. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


424061 Validation failed. ORACLE_HOME %s does not exist

Description:

Directory specified as ORACLE_HOME does not exist. ORACLE_HOME property is specified when creating Oracle_server and Oracle_listener resources.

Solution:

Specify correct ORACLE_HOME when creating resource. If resource is already created, please update resource property 'ORACLE_HOME'.
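Before creating or updating the resource, the directory can be checked by hand. The following Python sketch shows one way to do that; it is not the HA-Oracle validation code, and the function name `validate_oracle_home` is hypothetical. It only reuses the message text shown above.

```python
import os
import tempfile


def validate_oracle_home(path):
    """Return an error string in the style of message 424061, or None if path is a directory."""
    if not os.path.isdir(path):
        return f"Validation failed. ORACLE_HOME {path} does not exist"
    return None


if __name__ == "__main__":
    # A temporary directory stands in for a real ORACLE_HOME in this sketch.
    with tempfile.TemporaryDirectory() as existing:
        print(validate_oracle_home(existing))
        print(validate_oracle_home(os.path.join(existing, "missing")))
```

Run the check on every node that can host the resource, because the directory must exist wherever the resource may be started.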


424095 scvxvmlg fatal error - %s does not exist

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


424774 Resource group <%s> requires operator attention due to STOP failure

Description:

This is a notification from the rgmd that a resource group has had a STOP method failure or timeout on one of its resources. The resource group is in ERROR_STOP_FAILED state. This may cause another operation such as scswitch(1M), scrgadm(1M), or scha_control(1HA,3HA) to fail with a SCHA_ERR_STOPFAILED error.

Solution:

Refer to the procedure for clearing the ERROR_STOP_FAILED condition on a resource group in the Sun Cluster Administration Guide.


424783 pmf_monitor_suspend: PCRUN: %s

Description:

The rpc.pmfd server was not able to suspend the monitoring of a process and the monitoring of the process has been aborted. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


424816 Unable to set automatic MT mode.

Description:

The rpc.pmfd server was not able to set the multi-threaded operation mode. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


424834 Failed to connect to %s port %d for secure resource %s.

Description:

An error occurred while the fault monitor was trying to connect to a port specified in the Port_list property for this secure resource.

Solution:

Check to make sure that the Port_list property is correctly set to the same port number that the Netscape Directory Server is running on.


425053 CCR: Can't access table %s while updating it on node %s errno = %d.

Description:

The indicated error occurred while updating the indicated table on the indicated node. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28 (ENOSPC) indicates that the root file system on the node is full. Other values of errno can be returned when the root disk has failed (EIO) or some of the CCR tables have been deleted outside the control of the cluster software (ENOENT).

Solution:

There may be other related messages on the node where the failure occurred. These may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the indicated table was accidentally deleted, then boot the offending node in -x mode to restore the indicated table from other nodes in the cluster. The CCR tables are located at /etc/cluster/ccr/. If the root disk on the afflicted node has failed, then it needs to be replaced.
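When reading the errno value from this message, a small helper can translate the number into its symbolic name and the likely cause listed in the description. The Python sketch below is an illustration only; the function name `explain_errno` and the hint table are hypothetical. It uses the errno numbering of the system that runs the script, which should be compared against /usr/include/sys/errno.h on the cluster node.

```python
import errno
import os

# Hints taken from the description of message 425053; other values get a generic pointer.
CCR_HINTS = {
    errno.ENOSPC: "the root file system on the node is full",
    errno.EIO: "the root disk may have failed",
    errno.ENOENT: "a CCR table may have been deleted outside the control of the cluster software",
}


def explain_errno(value):
    """Return a one-line explanation for an errno value reported by the CCR."""
    name = errno.errorcode.get(value, "UNKNOWN")
    hint = CCR_HINTS.get(value, "see /usr/include/sys/errno.h")
    return f"errno {value} ({name}): {os.strerror(value)}; {hint}"


if __name__ == "__main__":
    print(explain_errno(errno.ENOSPC))
```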


425551 getnetconfigent (open_cmd_port) failed

Description:

Call to getnetconfigent failed and ucmmd could not get network information. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


426221 CMM: Reservation key changed from %s to %s for node %s (id = %d).

Description:

The reservation key for the specified node was changed. This can only happen due to the CCR infrastructure being changed by hand, which is not a supported operation. The system can not continue, and the node will panic.

Solution:

Boot the node in non-cluster (-x) mode, recover a good copy of the file /etc/cluster/ccr/infrastructure from one of the cluster nodes or from backup, and then boot this node back in cluster mode. If all nodes in the cluster exhibit this problem, then boot them all in non-cluster mode, make sure that the infrastructure files are the same on all of them, and boot them back in cluster mode. The problem should not happen again.


426678 rgmd died

Description:

An inter-node communication failed because the rgmd died on another node. To avoid data corruption, the failfast mechanism will cause that node to halt or reboot.

Solution:

No action is required. The cluster will reconfigure automatically. Examine syslog output on the rebooted node to determine the cause of node death. The syslog output might indicate further remedial actions.


427129 char*fmt


429663 Node %s not in list of configured nodes

Description:

The specified scalable service could not be started on this node because the node is not in the list of configured nodes for this particular service.

Solution:

If the specified service needs to be started on this node, use scrgadm to add the node to the list of configured nodes for this service and then restart the service.


429819 Monitor_retry_interval is not set.

Description:

The resource property Monitor_retry_interval is not set. This property specifies the time interval between two restarts of the fault monitor.

Solution:

Check whether this property is set. Otherwise, set it using scrgadm(1M).


429820 NetBackup daemon <%s> is not running.

Description:

One of the NetBackup master daemons ('bprd', 'bpdbm', 'vmd') is not running.

Solution:

None, informational message.


429907 clexecd: waitpid returned %d. Returning %d to clexecd.

Description:

clexecd program has encountered a failed waitpid(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


430445 Monitor initialization error. Incorrect arguments

Description:

An error occurred in monitor initialization. Arguments passed to the monitor by callback methods were incorrect.

Solution:

This is an internal error. Disable the monitor and report the problem.


432144 %d entries found in property %s. For a secure %s instance %s should have one or two entries.

Description:

Since a secure Server instance can listen on only one or two ports, the specified property should have either one or two entries. A different number of entries was found.

Solution:

Change the number of entries to be either one or two.
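The rule behind this message is a simple count check. The Python sketch below illustrates it under the assumption that the property value is a comma-separated list such as `636/tcp,389/tcp`; that format and the function names are assumptions of this sketch, not taken from the data service code.

```python
def count_entries(property_value):
    """Count the non-empty, comma-separated entries in a property value."""
    return len([entry for entry in property_value.split(",") if entry.strip()])


def check_secure_entries(property_value):
    """Return True when a secure instance has one or two entries, as the message requires."""
    return 1 <= count_entries(property_value) <= 2


if __name__ == "__main__":
    # Hypothetical property values.
    for value in ("636/tcp", "636/tcp,389/tcp", "636/tcp,389/tcp,8080/tcp"):
        print(value, check_secure_entries(value))
```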


432166 Partially successful probe of %s port %d for non-secure resource %s. (%s)

Description:

The probe was only partially successful because of the reason given.

Solution:

If the problem persists the fault monitor will correct it by doing a restart or failover. For more error description, look at the syslog messages.


432222 %s is not a valid IPMP group on this node.

Description:

Validation of the adapter information has failed. The specified IPMP group does not exist on this node.

Solution:

Create appropriate IPMP group on this node or recreate the logical host with correct IPMP group.


432473 reservation fatal error(%s) - joining_node not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


432987 Failed to retrieve nodeid.

Description:

The data service failed to retrieve the host information.

Solution:

If the logical host and shared address entries are specified in the /etc/inet/hosts file, check that these entries are correct. If this is not the reason, then check the health of the name server. For more error information, check the syslog messages.


433438 Setup error. SUPPORT_FILE %s does not exist

Description:

This is an internal error. Support file is used by HA-Oracle to determine the fault monitor information.

Solution:

Please report this problem.


433481 reservation fatal error(%s) - did_get_num_paths() error in is_scsi3_disk(), returned %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


433501 fatal: priocntl: %s (UNIX error %d)

Description:

The daemon indicated in the message tag (rgmd or ucmmd) has encountered a failed system call to priocntl(2). The error message indicates the reason for the failure. The daemon will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

If the error message is not self-explanatory, save a copy of the /var/adm/messages files on all nodes, and of the core file generated by the daemon. Contact your authorized Sun service provider for assistance in diagnosing the problem.


433895 INTERNAL ERROR: Invalid resource property tunable flag <%d> for property <%s>; aborting node

Description:

An internal error occurred in the rgmd while checking whether a resource property could be modified. The rgmd will produce a core file and will force the node to halt or reboot.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


434480 CCR: CCR data server not found.

Description:

The CCR data server could not be found in the local name server.

Solution:

Reboot the node. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


435521 Warning: node %d does not have a weight assigned to it for property %s, but node %d is in the %s for resource %s. A weight of %d will be used for node %d.

Description:

The named node does not have a weight assigned to it, but it is a potential master of the resource.

Solution:

No user action is required if the default weight is acceptable. Otherwise, use scrgadm(1M) to set the Load_balancing_weights property to include the node that does not have an explicit weight set for it.
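To see which nodes would fall back to the default weight, the property value can be compared against the list of potential masters. The Python sketch below assumes the property is written as comma-separated `weight@node` entries (for example `3@1,1@2`) and takes the default weight as a parameter, because the actual value is the one reported in the message. The function name `effective_weights` is hypothetical.

```python
def effective_weights(weights_property, node_ids, default_weight):
    """Return (weights, missing): the weight used for each node and the nodes
    that had no explicit entry and therefore received default_weight."""
    weights = {}
    for entry in weights_property.split(","):
        entry = entry.strip()
        if not entry:
            continue
        weight, node = entry.split("@", 1)
        weights[node.strip()] = int(weight)

    missing = []
    for node in node_ids:
        if node not in weights:
            weights[node] = default_weight
            missing.append(node)
    return weights, missing


if __name__ == "__main__":
    # Hypothetical values: nodes 1 and 2 have explicit weights, node 3 does not.
    print(effective_weights("3@1,1@2", ["1", "2", "3"], default_weight=1))
```

Each node reported as missing corresponds to one occurrence of this warning.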


436659 Failed to start the adaptive server.

Description:

Sun Cluster HA for Sybase failed to start sybase server. Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Check whether the server can be started manually. Examine the HA-Sybase log files, sybase log files, and setup.


436871 liveCache is already online.

Description:

liveCache had already been started outside of Sun Cluster when Sun Cluster tried to start it. In this case, Sun Cluster will just put the already running liveCache under Sun Cluster's control.

Solution:

Informative message, no action is needed.


437100 Validation failed. Invalid command line parameter %s %s

Description:

Unable to process parameters passed to the call back method. This is an internal error.

Solution:

Please report this problem.


437236 dl_bind: DLPI error %u

Description:

DLPI protocol error. We cannot bind to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


437975 The property %s cannot be updated because it affects the scalable resource %s.

Description:

The property named is not allowed to be changed after the resource has been created.

Solution:

If the property must be changed, then the resource should be removed and re-added with the new value of the property.


438174 No configuration file ${BV1TO1_VAR}/etc/bv1to1.conf.

Description:

The specified configuration file is not found.

Solution:

Check if the Broadvision software was installed properly. Make sure the configuration file is available at the right location.


438420 Interface %s is plumbed but is not suitable for global networking.

Description:

The specified adapter may be either a point-to-point adapter or a loopback adapter, neither of which is suitable for global networking.

Solution:

Reconfigure the appropriate IPMP group to exclude this adapter.


438454 request addr > max \"%s\"

Description:

Error from udlm on an address request. udlm exits and the node aborts and panics.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


438700 Some ip addresses might still be on loopback.

Description:

Some of the ip addresses managed by the specified SharedAddress resource were not removed from the loopback interface.

Solution:

Use the ifconfig command to make sure that the ip addresses being managed by the SharedAddress implementation are present either on the loopback interface or on a physical adapter. If they are present on both, use ifconfig to delete them from the loopback interface. Then use scswitch to move the resource group containing the SharedAddresses to another node to make sure that the resource group can be switched over successfully.


438866 sysinfo in getlocalhostname failed

Description:

sysinfo call did not succeed. The "sysinfo" man page describes possible error codes.

Solution:

This is an internal error. Please report this problem.


439099 HA: hxdoor %d.%d does not exist on secondary

Description:

An HA framework hxdoor is missing.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


440394 <%s> operation failed


440406 Cannot check online status. Server processes are running.

Description:

HA-Oracle could not check the online status of the Oracle server. Oracle server processes are running, but the Oracle server may or may not be online yet.

Solution:

Examine the 'Connect_string' property of the resource. Make sure that the user ID and password specified in the connect string are correct and that the user has been granted permission to connect to the server.


440530 Started the fault monitor.

Description:

The fault monitor for this data service was started successfully.

Solution:

No action needed.


440792 Warning: some resources in resource group <%s> failed to start

Description:

The indicated resource group was pending online. One or more resources' START methods failed to execute successfully. Because the resources' Failover_mode is set to NONE, the resource group is moving to the ONLINE_FAULTED state rather than failing over to another node.

Solution:

This is a warning message, no user action is needed. The operator may choose to issue an scswitch(1M) command to try switching the affected resource group to another node or to try restarting it on the same node.


441826 "pmfadm -a" Action failed for <%s>

Description:

The given tag has exceeded the allowed number of retry attempts (given by the 'pmfadm -n' option) and the action (given by the 'pmfadm -a' option) was initiated by rpc.pmfd. The action failed (i.e., returned non-zero), and rpc.pmfd will delete this tag from its tag list and discontinue retry attempts.

Solution:

This message is informational; no user action is needed.


442053 clcomm: Invalid path_manager client_type (%d)

Description:

The system attempted to add a client of unknown type to the set of path manager clients.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


442281 reservation error(%s) - did_get_path() error in other_node_status()

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


442767 Failed to stop SAP processes under PMF with SIGKILL.

Description:

Failed to stop SAP processes under the Process Monitor Facility (PMF) with a signal.

Solution:

This is an internal error. No user action needed. Save the /var/adm/messages from all nodes. Contact your authorized Sun service provider.


443271 clcomm: Pathend: Aborting node because %s for %u ms

Description:

The pathend aborted the node for the specified reason.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


443479 CMM: Quorum device %ld with gdevname %s has %d configured path - Ignoring mis-configured quorum device.

Description:

The specified number of configured paths to the specified quorum device is less than two, which is the minimum allowed. This quorum device will be ignored.

Solution:

Reconfigure the quorum device appropriately.


443746 resource %s state on node %s change to %s

Description:

This is a notification from the rgmd that a resource's state has changed. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


444001 %s: Call failed, return code=%d

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) to execute the action shown, and was not able to read the rpc error. The rpc error number is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


444078 Cleaning up IPC facilities.

Description:

Sun Cluster is cleaning up the IPC facilities used by the application.

Solution:

This is an informational message, no user action is needed.


444144 clcomm: Cannot change increment

Description:

An attempt was made to change the flow control policy parameter that specifies the thread increment level. The flow control system uses this parameter to set the number of threads that are acted upon at one time. This value currently cannot be changed.

Solution:

No user action required.


445616 libsecurity: create of rpc handle to program %ld failed, will not retry

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection, after multiple retries. The maximum time allowed for connecting has been exceeded, or the types of rpc errors encountered indicate that there is no point in retrying. An accompanying error message shows the rpc error data. The pmfadm or scha command exits with error. The program number is shown. To find out what program corresponds to this number, use the rpcinfo command. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


446068 CMM: Node %s (nodeid = %ld) is down.

Description:

The specified node has gone down in that communication with it has been lost.

Solution:

The cause of the failure should be resolved and the node should be rebooted if node failure is unexpected.


446249 Method <%s> on resource <%s>: authorization error.

Description:

An attempted method execution failed, apparently due to a security violation; this error should not occur. This failure is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be diagnosed. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


447578 Duplicated installed nodename when Resource Type <%s> is added.

Description:

User has defined duplicated installed node name when creating resource type.

Solution:

Recheck the installed nodename list and make sure there is no nodename duplication.


447872 fatal: Unable to reserve %d MBytes of swap space; exiting

Description:

The rgmd was unable to allocate a sufficient amount of memory upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


448703 clcomm: validate_policy: high too small. high %d low %d nodes %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. The high server thread level must be large enough to grant the low number of threads to all of the nodes identified in the message for a fixed size resource pool.

Solution:

No user action required.


448844 clcomm: inbound_invo::done: state is 0x%x

Description:

The internal state describing the server side of a remote invocation is invalid when the invocation completes server side processing.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


448898 %s.nodes entry in the configuration file must be between 1 and %d.

Description:

Illegal value for a node number. Perhaps the system is not booted as part of the cluster.

Solution:

Make sure the node is booted as part of a cluster.


449159 clconf: No valid quorum_vote field for node %u

Description:

The quorum vote field was found to be incorrect while converting the quorum configuration information into the quorum table.

Solution:

Check the quorum configuration information.


449288 setgid: %s

Description:

The rpc.pmfd server was not able to set the group id of a process. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


449336 setsid: %s

Description:

The rpc.pmfd or rpc.fed server was not able to set the session id, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


449344 setuid: %s

Description:

The rpc.pmfd server was not able to set the user id of a process. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


449661 No permission for owner to write %s.

Description:

The owner of the file does not have write permission on it.

Solution:

Set the permissions on the file so the owner can write it.
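
A minimal sketch of the fix, assuming it is acceptable to add only the owner write bit and leave the other permission bits unchanged. The function name is hypothetical; from a shell, the equivalent would be chmod u+w on the file.

```python
import os
import stat

def ensure_owner_can_write(path):
    """Add the owner write bit to `path` if missing; return True if a change was made."""
    mode = os.stat(path).st_mode
    if mode & stat.S_IWUSR:
        return False  # owner can already write; nothing to do
    # Keep the existing permission bits and add owner write.
    os.chmod(path, stat.S_IMODE(mode) | stat.S_IWUSR)
    return True
```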


449907 scvxvmlg error - mknod(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


449979 ALL the daemons are running on $HOSTNAME

Description:

This is just an informational message from the BV probe that all the daemons are running.

Solution:

No action required.


450173 Error accessing policy

Description:

This message appears when the customer is initializing or changing a scalable services load balancer, by starting or updating a service. The Load_Balancing_Policy is missing.

Solution:

Add a Load_Balancing_Policy parameter when creating the resource group.


450780 Error: Unable to create scha_control timestamp file <%s> for resource <%s>

Description:

The rgmd has failed in an attempt to create a file used for the anti-"pingpong" feature. This may prevent the anti-pingpong feature from working, which may permit a resource group to fail over repeatedly between two or more nodes. The failure to create the file might indicate a more serious problem on the node.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.


451315 Error retrieving the extension property %s: %s.

Description:

An error occurred reading the indicated extension property.

Solution:

Check syslog messages for errors logged from other system modules. If error persists, please report the problem.


451640 tag %s: stat of command file %s failed

Description:

The rpc.fed server checked the command path indicated by the tag, and this check failed, possibly because the path is incorrect. An error message is output to syslog.

Solution:

Check the path of the command.


452150 Failed to start the fault monitor.

Description:

Process monitor facility has failed to start the fault monitor.

Solution:

Check whether the system is low in memory or the process table is full and correct these problems. If the error persists, use scswitch to switch the resource group to another node.


452202 clcomm: sdoor_sendstream::send

Description:

This operation should never occur.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


452205 Failed to form the %s command.

Description:

The method searches the commands input to the Agent Builder for the occurrence of specific Builder defined variables, e.g. $hostnames, and replaces them with appropriate value. This action failed.

Solution:

Check syslog messages and correct the problems specified in prior syslog messages. If the error still persists, please report this problem.


452279 CMM: Retry of initialization for quorum device %s was successful.

Description:

This node was fenced off from the quorum device while it was booting, so the initial attempt to access the device returned EACCES. When the access was retried, it was successful.

Solution:

This is an informational message, no user action is needed.


452550 scds_syslog_buf


452552 Extension property <%s> has a value of <%s>

Description:

The property is set to the indicated value.

Solution:

This message is informational; no user action is needed.


452604 CMM: Registered key on and acquired quorum device %ld (gdevname %s).

Description:

When this node was booting up, it had found only non-cluster member keys on the specified device. After joining the cluster and having its CCR recovered, this node has been able to register its keys on this device and is its owner.

Solution:

This is an informational message, no user action is needed.


453207 some BV processes are still running on $HOSTNAME

Description:

This is just an informational message that not all the BV processes could be stopped and some BV processes are still running.

Solution:

No user action required. The service method will try to stop the BV processes again.


453919 Pathprefix is not set for resource group %s.

Description:

Resource Group property Pathprefix is not set.

Solution:

Use scrgadm to set the Pathprefix property on the resource group.


454247 Error: Unable to create directory <%s> for scha_control timestamp file

Description:

The rgmd is unable to access the directory used for the anti-"pingpong" feature, and cannot create the directory (which should already exist). This may prevent the anti-pingpong feature from working, which may permit a resource group to fail over repeatedly between two or more nodes. The failure to access or create the directory might indicate a more serious problem on the node.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.


454607 INTERNAL ERROR: Invalid resource extension property type <%d> on resource <%s>; aborting node

Description:

An attempted creation or update of a resource has failed because of invalid resource type data. This may indicate CCR data corruption or an internal logic error in the rgmd. The rgmd will produce a core file and will force the node to halt or reboot.

Solution:

Use scrgadm(1M) -pvv to examine resource properties. If the resource or resource type properties appear to be corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Re-try the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


454930 Scheduling class %s not configured

Description:

An attempt to change the thread scheduling class failed, because the scheduling class was not configured.

Solution:

Configure the system to support the desired thread scheduling class.


456853 %s can't DOWN

Description:

This means that the Logical IP address could not be set to DOWN.

Solution:

There could be other related error messages which might be helpful. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


457114 fatal: death_ff->arm failed

Description:

The daemon specified in the error tag was unable to arm the failfast device. The failfast device kills the node if the daemon process dies either due to hitting a fatal bug or due to being killed inadvertently by an operator. This is a requirement to avoid the possibility of data corruption. The daemon will produce a core file and will cause the node to halt or reboot.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the core file generated by the daemon. Contact your authorized Sun service provider for assistance in diagnosing the problem.


457121 Failed to retrieve the host information for %s: %s.

Description:

The data service failed to retrieve the host information.

Solution:

If the logical hostname and shared address entries are specified in the /etc/inet/hosts file, check that the entries are correct. Verify the settings in the /etc/nsswitch.conf file include "files" for host lookup. If these are correct, check the health of the name server. For more error information, check the syslog messages.
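
The nsswitch.conf check above can be sketched as a small parser that reports which sources the `hosts:` line lists. The function names are hypothetical, and the sketch assumes the usual one-line `hosts:` entry with `#` comments and optional bracketed action items.

```python
def hosts_sources(nsswitch_text):
    """Return the lookup sources listed on the `hosts:` line of nsswitch.conf text."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            tokens = line[len("hosts:"):].split()
            # Skip action items such as [NOTFOUND=return]; keep source names only.
            return [tok for tok in tokens if not tok.startswith("[")]
    return []

def hosts_lookup_uses_files(nsswitch_text):
    """True if "files" is one of the configured sources for host lookup."""
    return "files" in hosts_sources(nsswitch_text)
```

To run it against a node, read the contents of /etc/nsswitch.conf and pass the text to hosts_lookup_uses_files.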


458091 CMM: Reconfiguration delaying for %d seconds to allow larger partitions to win race for quorum devices.

Description:

In the case of potential split brain scenarios, the CMM allows larger partitions to win the race to acquire quorum devices by forcing the smaller partitions to sleep for a time period proportional to the number of nodes not in that partition.

Solution:

This is an informational message, no user action is needed.
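
As a rough illustration of the rule in the description, the delay can be modeled as a per-node constant multiplied by the number of nodes outside the partition. The function name and the constant below are hypothetical; this guide does not state the value that the CMM actually uses.

```python
def quorum_race_delay(total_nodes, partition_nodes, seconds_per_missing_node=1.0):
    """Delay before a partition races for quorum devices.

    Proportional to the number of cluster nodes NOT in this partition, so a
    smaller partition waits longer than a larger one.
    seconds_per_missing_node is a made-up constant for illustration only.
    """
    if not 0 < partition_nodes <= total_nodes:
        raise ValueError("partition_nodes must be between 1 and total_nodes")
    return seconds_per_missing_node * (total_nodes - partition_nodes)

# In a four-node cluster split 3/1, the single node waits longer than the trio.
print(quorum_race_delay(4, 3), quorum_race_delay(4, 1))
```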


458373 fatal: cannot create thread to notify President of state changes

Description:

The rgmd was unable to create a thread upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


458530 Method <%s> on resource <%s>: program file is not executable.

Description:

A method pathname points to a file that is not executable. This may have been caused by incorrect installation of the resource type.

Solution:

Identify registered resource type methods using scrgadm(1M) -pvv. Check the permissions on the resource type methods. Reinstall the resource type if necessary, following resource type documentation.
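
Once the method pathnames are known, the permission check can be sketched as follows. This is a minimal sketch with a hypothetical function name; it inspects only the file type and the execute permission bits, not whether the invoking user is allowed to run the file.

```python
import os
import stat

def method_file_problem(path):
    """Return a short reason why `path` cannot be executed, or None if it looks fine."""
    try:
        mode = os.stat(path).st_mode
    except OSError as err:
        return "cannot stat: %s" % err.strerror
    if not stat.S_ISREG(mode):
        return "not a regular file"
    if not mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        return "no execute permission bits set"
    return None
```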


458818 reservation fatal error(%s) - disk_file not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


458880 event lacking correct names

Description:

Solution:


458988 libcdb: scha_cluster_open failed with %d

Description:

Call to initialize a handle to get cluster information failed. The second part of the message gives the error code.

Solution:

The calling program should handle this error. If it is not recoverable, it will exit.


460027 Resource <%s> of Resource Group <%s> failed sanity check on node <%s>\n

Description:

Message logged for failed scha_control sanity check methods on specific node.

Solution:

No user action required.


460520 scvxvmlg fatal error - dcs_get_service_names_of_class(%s) failed, returned %d

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


460635 <%s> operation failed: %s.


461776 %s failed: %s.


462083 fatal: Resource <%s> update failed with error <%d>; aborting node

Description:

Rgmd failed to read updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


462632 HA: repl_mgr: exception invalid_repl_prov_state %d

Description:

The system did not perform this operation on the primary object.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


463953 %s failed to complete

Description:

The command failed.

Solution:

Check the syslog and /var/adm/messages for more details.


464588 Failed to retreive the resource group property %s: %s.

Description:

An API operation has failed while retrieving the resource group property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components. For the resource group name and the property name, check the current syslog message.


465065 Error accessing group

Description:

This message appears when the customer is initializing or changing a scalable services load balancer, by starting or updating a service. The specified resource group is invalid.

Solution:

Check the resource group name specified and make sure that a valid value is used.


466896 Could not create file %s: %s.

Description:

Failed to create file.

Solution:

Check whether the permissions are valid. This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action.


468477 Failed to retrieve the property %s: %s.

Description:

API operation has failed in retrieving the cluster property.

Solution:

For property name, check the syslog message. For more details about API call failure, check the syslog messages from other components.


468477 Failed to retrieve the property %s: %s.

Description:

An internal error occurred in the rgmd while checking a cluster property.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


468732 Too many modules configured for autopush.

Description:

The system attempted to configure a clustering STREAMS module for autopush but too many modules were already configured.

Solution:

Check your /etc/iu.ap file to see whether too many modules have been configured to be autopushed on a network adapter. Reduce the number of modules. Use the autopush(1M) command to remove some modules from the autopush configuration.
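
The file check can be sketched as a small parser that lists entries configured with more modules than a chosen limit. This is a minimal sketch: it assumes each non-comment line has the layout major, minor, lastminor, followed by module names, and the limit is supplied by the caller because this guide does not state it. The function names and the sample entries are hypothetical.

```python
def autopush_entries(iu_ap_text):
    """Parse iu.ap-style text into (major, minor, lastminor, modules) tuples."""
    entries = []
    for raw in iu_ap_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        if len(fields) < 4:
            continue  # blank, comment-only, or incomplete line
        major, minor, lastminor = fields[:3]
        entries.append((major, minor, lastminor, fields[3:]))
    return entries

def entries_over_limit(iu_ap_text, max_modules):
    """Return the entries that list more than `max_modules` modules."""
    return [e for e in autopush_entries(iu_ap_text) if len(e[3]) > max_modules]

# Illustrative sample only; the module names modA..modC are made up.
SAMPLE = """\
# major  minor  lastminor  modules
wc       0      0          ldterm ttcompat
hme      -1     0          modA modB modC
"""

print(entries_over_limit(SAMPLE, 2))
```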


469417 Failfast: timeout - unit \"%s\"%s.

Description:

A failfast client has encountered a timeout and is going to panic the node.

Solution:

There may be other related messages on this node which may help diagnose the problem. Resolve the problem and reboot the node if node panic is unexpected.


469817 Specified resource group does not exist: %s.

Description:

The name of a specified resource group is invalid. Such a resource group does not exist.

Solution:

This is probably the result of specifying an incorrect resource group name in a dependency, or in an extension property of a resource or resource group. Please repeat the steps which led to this error using an existing resource group name.


471241 Probing SAP Message Server times out with command %s.

Description:

Checking SAP message server with utility lgtst times out. This may happen under heavy system load.

Solution:

You might consider increasing the Probe_timeout property. Try switching the resource group to another node using scswitch(1M).


471788 Unable to resolve hostname %s

Description:

Solution:


472185 Failed to retrieve the resource group property %s for %s: %s.

Description:

The query for a property failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


473021 in libsecurity uname sys call failed: %s

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) because the host name could not be obtained. The system error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


473460 Method <%s> on resource <%s>: authorization error: %s.

Description:

An attempted method execution failed, apparently due to a security violation; this error should not occur. The last portion of the message describes the error. This failure is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Correct the problem identified in the error message. If necessary, examine other syslog messages occurring at about the same time to see if the problem can be diagnosed. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


473653 Failed to retrieve the resource type handle for %s while querying for property %s: %s.

Description:

Access to the object named failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


474256 Validations of all specified global device services complete.

Description:

All device services specified directly or indirectly via the GlobalDevicePath and FilesystemMountPoint extension properties respectively are found to be correct. Other Sun Cluster components like DCS, DSDL, RGM are found to be in order. Specified file system mount point entries are found to be correct.

Solution:


474690 clexecd: Error %d from send_fd

Description:

clexecd program has encountered a failed fcntl(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


475303 Could not start BV 121 daemons on $HOSTNAME.

Description:

The Broadvision servers could not be started on the specified host. This could happen if orbixd did not get started properly or if there are any configuration errors.

Solution:

See if there are any internal errors. Look at the Broadvision configuration and check if everything is OK. Try to start BV on the specified host manually and check if it can be started properly. If orbixd could be started manually but couldn't be started under HA, contact Sun support with /var/adm/messages.


475398 Out of memory (memory allocation failed):%s.%s

Description:

There is not enough swap space on the system.

Solution:

Add more swap space. See swap(1M) for more details.


475888 tcp_transport could not modify rio_threadpoolAsked for %d, got only %d


476157 Failed to get the pmf_status. Error: %s.

Description:

A method could not obtain the status of the service from PMF. The specific cause for the failure may be logged with the message.

Solution:

Look in /var/adm/messages for the cause of failure. Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


477296 Validation failed. SYBASE ASE STOP_FILE %s not found.

Description:

The file specified in the STOP_FILE extension property was not found, or the file specified is not an ordinary file.

Solution:

Please check that the file specified in the STOP_FILE extension property exists on all the nodes.


477365 flock(lockfile) failed: %s.


477378 Failed to restart the service.

Description:

The attempt to restart the data service failed.

Solution:

Check the syslog messages that occurred just before this message to see whether there was an internal error. In case of an internal error, contact your Sun service provider. Otherwise, one of the following situations may have occurred. 1) The Start_timeout and Stop_timeout values may not be appropriate; check them and adjust them if necessary. 2) The system may lack resources; check whether the system is low on memory or the process table is full and take appropriate action.


477816 clexecd: priocntl returned %d. Exiting.

Description:

clexecd program has encountered a failed priocntl(2) system call. The error message indicates the error number for the failure.

Solution:

clexecd program will exit and node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


478523 Could not mount '%s' because there was an error (%d) in opening the directory.

Description:

While mounting a Cluster file system, the directory on which the mount is to take place could not be opened.

Solution:

Fix the reported error and retry. The most likely problem is that the directory does not exist - in that case, create it with the appropriate permissions and retry.


478801 Unable to open %s: %s.


479105 Can not get service status for global service <%s> of path <%s>

Description:

Can not get status for the global service. This is a severe problem.

Solution:

Contact your authorized Sun service provider to determine the cause of the problem.


479184 Failed to signal cl_apid.

Description:

Solution:


479213 Monitor server terminated.

Description:

Graceful shutdown did not succeed. Monitor server processes were killed in STOP method. It is likely that adaptive server terminated prior to shutdown of monitor server.

Solution:

Check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.


479432 The application process tree has died and the action to be taken as determined by scds_fm_history is to failover. However the application is not being failed over because the failover_enabled extension property is set to false. Restarting the application instead.

Description:

The failover_enabled property is set to false. The probe is trying to restart the application locally instead of failing it over.

Solution:

This is an informational message, no user action is needed.


479442 in libsecurity could not allocate memory

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start, or a client was not able to make an rpc connection to the server, probably due to low memory. An error message is output to syslog.

Solution:

Investigate if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


481813 Mismatched NAFO group callback %s CCR %s.

Description:

Solution:


482531 IPMP group %s has unknown status %d. Skipping this IPMP group.

Description:

The status of the IPMP group is not among the set of statuses that is known.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


482901 Can't allocate binding element

Description:

Client affinity state on the node has become incomplete due to unexpected memory shortage. New connections from some clients that have existing connections with this node might go to a different node as a result.

Solution:

If client affinity is a requirement for some of the sticky services, say due to data integrity reasons, these services must be brought offline on the node, or the node itself should be restarted.


482909 Failed to parse xml: nvpairs without subclass

Description:

Solution:


483160 Failed to connect to socket: %s.

Description:

While determining the health of the resource, process monitor facility has failed to communicate with the resource fault monitor.

Solution:

Any of the following situations might have occurred. 1) The fault monitor is not running; if so, wait for the fault monitor to start. 2) The fault monitor is disabled; if so, enable the fault monitor if desired, otherwise ignore this message. 3) In all other situations, consider it an internal error. Save the /var/adm/messages file and contact your authorized Sun service provider. For a more detailed error description, check the syslog messages.


483528 NULL value returned for resource name.

Description:

A null value was returned for resource name.

Solution:

Check the resource name.


483858 Must set at least one of Port_List or Monitor_Uri_List.

Description:

When creating the resource a Port_List or Monitor_Uri_List must be specified.

Solution:

Run the resource creation again specifying either a Port_List or Monitor_Uri_List.
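The rule behind this message is an "at least one of" check. The following is a minimal sketch of that rule, assuming both properties are represented as lists; the function name validate_monitor_targets is hypothetical and is not part of the data service.

```python
def validate_monitor_targets(port_list, monitor_uri_list):
    """Return True when at least one of the two properties is non-empty."""
    # An empty list (or None) counts as "not specified".
    return bool(port_list) or bool(monitor_uri_list)
```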


484084 INTERNAL ERROR: non-existent resource <%s> appears in dependency list of resource <%s>

Description:

While attempting to execute an operator-requested enable of a resource, the rgmd has found a non-existent resource to be listed in the Resource_dependencies or Resource_dependencies_weak property of the indicated resource. This suggests corruption of the RGM's internal data but is not fatal.

Solution:

Use scrgadm(1M) -pvv to examine resource group properties. If the values appear corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


484513 Failed to retrieve the probe command with error <%d>. Will continue to do the simple probe.

Description:

The fault monitor failed to retrieve the probe command from the cluster configuration. It will continue using the simple probe to monitor the application.

Solution:

No action required.


485464 clcomm: Failed to allocate simple xdoor server %d

Description:

The system could not allocate a simple xdoor server. This can happen when the xdoor number is already in use. This message is only possible on debug systems.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


485759 transition '%s' failed for cluster '%s'

Description:

The mentioned state transition failed for the cluster. udlmctl will exit.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


485942 (%s) sigprocmask failed: %s (UNIX errno %d)

Description:

Call to sigprocmask() failed. The "sigprocmask" man page describes possible error codes. udlmctl will exit.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


486841 SIOCGLIFCONF: %s

Description:

Solution:


487022 The networking components for scalable resource %s have been configured successfully for method %s.

Description:

The calls to the underlying scalable networking code succeeded.

Solution:

This is an informational message, no user action is needed.


487484 lkcm_reg: lib initialization failed

Description:

udlm could not register with cmm because lib initialization failed.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


487574 Failed to alloc memory

Description:

A scha_control call failed because the system has run out of swap space. The system is likely to halt or reboot if swap space continues to be depleted.

Solution:

Investigate the cause of swap space depletion and correct the problem, if possible.


487778 RGM isn't failing resource group <%s> off of node <%d>, because there are no other current or potential masters

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed because no candidate node was healthy enough to host the resource group, and the resource group was not currently mastered by any other node.

Solution:

Examine other syslog messages on all cluster members that occurred about the same time as this message, to see why other candidate nodes were not healthy enough to master the resource group. Repair the condition that is preventing any potential master from hosting the resource group.


487827 CCR: Waiting for repository synchronization to finish.

Description:

This node is waiting to finish the synchronization of its repository with other nodes in the cluster before it can join the cluster membership.

Solution:

This is an informational message; generally no user action is needed. If all the nodes in the cluster hang at this message for a long time, look for other messages. The possible causes are that the cluster has not obtained quorum, or that CCR metadata is missing or invalid. If the cluster is hanging due to missing or invalid metadata, the CCR metadata must be recovered from backup.


488276 in libsecurity write of file %s failed: %s

Description:

The rpc.pmfd, rpc.fed or rgmd server was not able to write to a cache file for rpcbind information. The affected component should continue to function by calling rpcbind directly.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


488980 HTTP GET Response Code for probe of %s is %d. Failover will be in progress

Description:

The status code of the response to an HTTP GET probe indicates that the HTTP server has failed. It will be restarted or failed over.

Solution:

This message is informational; no user action is needed.


488988 Unable to open libxml2.so.

Description:

Solution:


489069 Extension property <Failover_enabled> is not defined, using the default value of TRUE.

Description:

The failover_enabled property is not defined in the RTR file. A value of TRUE is being used as the default.

Solution:

This is an informational message, no user action is needed.


489438 clcomm: Path %s being drained

Description:

A communication link with another node is being removed. The interconnect may have failed or the remote node may be down.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


489644 Could not look up host because host was NULL.

Description:

Can't look up the hostname locally in hostfile. The specified host name is invalid.

Solution:

Check whether the hostname has a NULL value. If so, recreate the resource with a valid host name. Otherwise, treat it as an internal error and contact your authorized Sun service provider.


491081 resource %s removed.

Description:

This is a notification from the rgmd that the operator has deleted a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


491579 clcomm: validate_policy: fixed size pool low %d must match moderate %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. The low and moderate server thread levels must be the same for fixed size resource pools.

Solution:

No user action required.


491694 Could not %s any ip addresses.

Description:

The specified action was not successful for all ip addresses managed by the LogicalHostname resource.

Solution:

Check the logs for any error messages from pnm. This could result from a lack of system resources, such as low memory. Reboot the node if the problem persists.


491738 Local node failed to do affinity switchover to global service <%s> of path <%s>: %s

Description:

When prenet_start method of SUNW.HAStorage attempted an affinity switch, it failed.

Solution:

The affinity switchover may have failed due to an equivalent switchover having been in progress at the time. The service may indeed have successfully come online later during boot. Use the scstat(1M) -g command to verify service availability and scstat(1M) -D to identify the primary server. If the service state does not reflect the expected configuration, retry the affinity switchover via scswitch(1M).


492603 launch_fed_prog: fe_method_full_name() failed for program <%s>

Description:

The ucmmd was unable to assemble the full method pathname for the fed program to be launched. This is considered a launch_fed_prog failure.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


492953 ORACLE_HOME/bin/lsnrctl not found ORACLE_HOME=%s

Description:

Oracle listener binaries not found under ORACLE_HOME. ORACLE_HOME specified for the resource is indicated in the message. HA-Oracle will not be able to manage Oracle listener if ORACLE_HOME is incorrect.

Solution:

Specify the correct ORACLE_HOME when creating the resource. If the resource is already created, update the resource property 'ORACLE_HOME'.


492781 Retrying to retrieve the resource information: %s.

Description:

An update to the cluster configuration occurred while resource properties were being retrieved.

Solution:

This is only an informational message.


493657 Unable to get status for IPMP group %s.

Description:

The specified IPMP group is not in functional state. Logical host resource can't be started without a functional IPMP group.

Solution:

The LogicalHostname resource will not be brought online on this node. Check the messages (pnmd errors) logged just before this message for any IPMP or adapter problem. Correct the problem and rerun the scrgadm command.


494534 clcomm: per node IP config %s%d:%d (%d): %d.%d.%d.%d failed with %d

Description:

The system failed to configure IP communications across the private interconnect of this device and IP address, resulting in the error identified in the message. This happened during initialization. Someone has used the "lo0:1" device before the system could configure it.

Solution:

If you used "lo0:1", please use another device. Otherwise, contact your authorized Sun service provider to determine whether a workaround or patch is available.


494563 "pmfctl -S": Error suspending pid %d for tag <%s>: %d

Description:

An error occurred while rpc.pmfd attempted to suspend the monitoring of the indicated pid, possibly because the indicated pid exited during the attempt to suspend its monitoring.

Solution:

Check whether the indicated pid has exited. If it has not, save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


494913 pmfd: unknown action (0x%x)

Description:

An internal error has occurred in the rpc.pmfd server. This should not happen.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


495284 dl_attach: DLPI error %u

Description:

Could not attach to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


495386 INTERNAL ERROR: %s.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


495529 Prog <%s> failed to execute step <%s> - <%s>

Description:

ucmmd failed to execute a step.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


495710 Stopping oracle server using shutdown immediate

Description:

Informational message. Oracle server will be stopped using 'shutdown immediate' command.

Solution:

None


496746 reservation error(%s) - USCSI_RESET failed for device %s, returned%d

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

This may be indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.


496884 Despite the warnings, the validation of the hostname list succeeded.

Description:

While validating the hostname list, non fatal errors have been found.

Solution:

This is an informational message. Correct the errors if applicable. For the error information, check the syslog messages logged before this message.


496991 BV Config Error:IMs not configured on either the physical or the private interconnect.

Description:

The Interaction Managers are not configured on either the physical node or on the cluster private node.

Solution:

Reconfigure the Interaction Managers on a physical host or on a cluster private IP. Refer to the HA-BV installation and configuration guide.


497795 gethostbyname() timed out.

Description:

The name service could be unavailable.

Solution:

If the cluster is under heavy load or carrying too much network traffic, increase the timeout value of the monitor_check method using the scrgadm command. Otherwise, check whether the name service is configured correctly. Try commands that query name servers, such as ping and nslookup, and correct the problem. If the error persists, reboot the node.
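One way to tell a slow name service from an unreachable one is to time a lookup by hand. The following is a minimal sketch under stated assumptions: the function name lookup_with_timeout and its parameters are hypothetical, and it is not the code path that produces this message. A resolution error other than a timeout is raised to the caller unchanged.

```python
import concurrent.futures
import socket


def lookup_with_timeout(hostname, timeout_seconds, resolver=socket.gethostbyname):
    """Resolve hostname, returning the address or None if the lookup times out."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(resolver, hostname)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        return None
    finally:
        # Do not block on a lookup that is still hanging in the worker thread.
        executor.shutdown(wait=False)
```

If lookups return quickly when run by hand but time out under load, raising the monitor_check timeout is the more likely fix; if they time out even on an idle node, examine the name service configuration.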


498582 Attempt to load %s failed: %s.

Description:

A shared address resource was in the process of being created. In order to prepare this node to handle scalable services, the specified kernel module was attempted to be loaded into the system, but failed.

Solution:

This might result from a lack of system resources. Check whether the system is low on memory and take appropriate action (for example, by killing hung processes). For specific information, check the syslog message. After more resources are available on the system, attempt to create the shared address resource again. If the problem persists, reboot.


498711 Could not initialize the ORB. Exiting.

Description:

clexecd program was unable to initialize its interface to the low-level clustering software.

Solution:

This might occur because the operator has attempted to start clexecd program on a node that is booted in non-cluster mode. If the node is in non-cluster mode, boot it into cluster mode. If the node is already in cluster mode, contact your authorized Sun service provider to determine whether a workaround or patch is available.


498909 accept: %s

Description:

Solution:


499150 Failed to start service.


499290 Malformed cmd string.


499486 Unable to set socket flags: %s.

Description:

Failed to set the non-blocking flag for the socket used in communicating with the application.

Solution:

This is an internal error. No corrective user action is available; report the problem to your authorized Sun service provider.


499756 CMM: Node %s: joined cluster.

Description:

The specified node has joined the cluster.

Solution:

This is an informational message, no user action is needed.


499775 resource group %s added.

Description:

This is a notification from the rgmd that a new resource group has been added. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


499802 Successfully started BV daemons on $HOSTNAME.

Description:

This is just an informational message that the BV processes on the specified host have started successfully.

Solution:

No action needed.

Message IDs 500000–599999


500568 fatal: Aborting this node because method <%s> on resource <%s> is unkillable

Description:

The specified callback method for the specified resource became stuck in the kernel, and could not be killed with a SIGKILL. The RGM reboots the node to force the data service to fail over to a different node, and to avoid data corruption.

Solution:

No action is required. This is normal behavior of the RGM. Other syslog messages that occurred just before this one might indicate the cause of the method failure.


501582 in libsecurity setnetpath failed: %s

Description:

The rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection, because it could not get the network database handle. The server does not start. The rpc error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


501632 Incorrect syntax in the environment file %s. Ignoring %s

Description:

HA-Oracle reads the file specified in the USER_ENV property and exports the variables declared in the file. The syntax for declaring a variable is VARIABLE=VALUE. Lines starting with '#' are treated as comments. VARIABLE is expected to be a valid Korn shell variable name that starts with a letter or '_' and contains only alphanumeric characters and '_'.

Solution:

Please check the environment file and correct the syntax errors. Do not use export statement in environment file.
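An environment file can be screened for this kind of syntax error before it is assigned to USER_ENV. The following is a minimal sketch, not the parser used by HA-Oracle: the function name parse_env_lines is hypothetical, and skipping blank lines and lines that start with '#' is an assumption of the sketch.

```python
import re

# A Korn shell style name: a letter or '_' first, then letters, digits, or '_'.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def parse_env_lines(lines):
    """Split lines into accepted (name, value) pairs and rejected lines."""
    accepted, rejected = [], []
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines are skipped silently.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and _NAME.fullmatch(name):
            accepted.append((name, value))
        else:
            # Includes "export NAME=VALUE", which is not a bare declaration.
            rejected.append(raw)
    return accepted, rejected
```

Any line returned in the rejected list is a candidate for the "Ignoring %s" part of this message, including lines that use an export statement.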


501733 scvxvmlg fatal error - _cladm() failed

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


501763 recv_request: t_alloc: %s

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


501917 process_intention(): IDL exception when communicating to node %d

Description:

An inter-node communication failed, probably because a node died.

Solution:

No action is required; the rgmd should recover automatically.


501981 Valid connection attempted from %s: %s

Description:

Solution:


502022 fatal: joiners_read_ccr: exiting early because of unexpected exception

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


502438 IPMP group %s has status %s.

Description:

The specified IPMP group is not in functional state. Logical host resource can't be started without a functional IPMP group.

Solution:

The LogicalHostname resource will not be brought online on this node. Check the messages (pnmd errors) logged just before this message for any IPMP or adapter problem. Correct the problem and rerun the scrgadm command.


503048 NULL value returned for resource group property %s.

Description:

NULL value was returned for resource group property.

Solution:

For the property name, check the syslog message. One of the following situations might have occurred, and each needs a different user action. 1) If a new resource group was created or updated, check whether the value of the property is valid. 2) For all other cases, treat it as an internal error.


503064 Method <%s> on resource <%s>: Method timed out.

Description:

A VALIDATE method execution has exceeded its configured timeout and was killed by the rgmd. This in turn will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Consult resource type documentation to diagnose the cause of the method failure. Other syslog messages occurring just before this one might indicate the reason for the failure. After correcting the problem that caused the method to fail, the operator may retry the resource group update operation.


503399 Failed to parse xml: invalid attribute %s

Description:

Solution:


503771 reservation warning(%s) - MHIOCGRP_REGISTERANDIGNOREKEY error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


503817 No PDT Dispatcher thread.

Description:

The system has run out of the resources that are required to create a thread. The system could not create the connection processing thread for scalable services.

Solution:

If cluster networking is required, add more resources (most probably, memory) and reboot.


503871 %s: returned from function svc_run().

Description:

RPC service has encountered an error, causing it to return from svc_run().


504363 ERROR: process_resource: resource <%s> is pending_update but no UPDATE method is registered

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


504402 CMM: Aborting due to stale sequence number. Received a message from node %ld indicating that node %ld has a stale sequence

Description:

After receiving a message from the specified remote node, the local node has concluded that it has stale state with respect to the remote node, and will therefore abort. The state of a node can get out-of-date if it has been in isolation from the nodes which have majority quorum.

Solution:

Reboot the node.


505101 Found another active instance of clexecd. Exiting daemon_process.

Description:

An active instance of clexecd program is already running on the node.

Solution:

This would usually happen if the operator tries to start the clexecd program by hand on a node which is booted in cluster mode. If that is not the case, contact your authorized Sun service provider to determine whether a workaround or patch is available.


506740 Home directory is not set for user %s.

Description:

No home directory set for the specified Broadvision user.

Solution:

Set the home directory of the Broadvision user to point to the directory containing the Broadvision config files.


508391 Error: failed to load catalog %s

Description:

Solution:


508671 mmap: %s

Description:

The rpc.pmfd server was not able to allocate shared memory for a semaphore, possibly due to low memory, and the system error is shown. The server does not perform the action requested by the client, and pmfadm returns error. An error message is also output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


508687 monitor_check: method <%s> failed on resource <%s> in resource group <%s> on node <%s>, exit code <%d>

Description:

During a scha_control call, the monitor_check method of the resource failed on the specified node.

Solution:

No action is required. This is normal behavior for scha_control, which launches the corresponding monitor_check method of the resource on all candidate nodes and looks for a healthy node that passes the test. If a healthy node is found, scha_control lets that node take over the resource group. Otherwise, scha_control exits early.


509069 CMM: Halting because this node has no configuration info about node %ld which is currently configured in the cluster and running.

Description:

The local node has no configuration information about the specified node. This indicates a misconfiguration problem in the cluster. The /etc/cluster/ccr/infrastructure table on this node may be out of date with respect to the other nodes in the cluster.

Solution:

Correct the misconfiguration problem or update the infrastructure table if out of date, and reboot the nodes. To update the table, boot the node in non-cluster (-x) mode, restore the table from the other nodes in the cluster or backup, and boot the node back in cluster mode.


509136 Probe failed.

Description:

Fault monitor was unable to perform complete health check of the service.

Solution:

1) The fault monitor takes appropriate action (by restarting or failing over the service). 2) The data service could be under load; try increasing the values of the Probe_timeout and Thorough_probe_interval properties. 3) If this problem continues to occur, look at other messages in syslog to determine the root cause of the problem. If all else fails, reboot the node.


510659 Failover %s data services must be in a failover resource group.

Description:

The Scalable resource property for the data service was set to FALSE, which indicates a failover resource, but the corresponding data service resource group is not a failover resource group. Failover resources of this resource type must reside in a failover resource group.

Solution:

Decide whether this resource is to be scalable or failover. If scalable, set the Scalable property value to TRUE. If failover, leave Scalable set to FALSE and create this resource in a failover resource group. A failover resource group has its resource group property RG_mode set to Failover.
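The rule in the description can be written as a single condition. The following is a minimal sketch of that condition only, assuming the Scalable property is available as a boolean and RG_mode as a string; the function name check_failover_placement is hypothetical and this is not the validation code of any resource type.

```python
def check_failover_placement(scalable, rg_mode):
    """Return an error string if a failover resource is misplaced, else None."""
    # A resource with Scalable=FALSE is a failover resource and must live in
    # a resource group whose RG_mode is Failover.
    if not scalable and rg_mode.lower() != "failover":
        return "Failover data services must be in a failover resource group."
    return None
```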


511177 clcomm: solaris xdoor door_info failed

Description:

A door_info operation failed. Refer to the "door_info" man page for more information.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


511633 Unrecognized callback registration mode %s.


511749 rgm_launch_method: failed to get method <%s> timeout value from resource <%s>

Description:

Due to an internal error, the rgmd was unable to obtain the method timeout for the indicated resource. This is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


511810 Property <%s> does not exist in SUNW.HAStorage

Description:

Property set in SUNW.HAStorage type resource is not defined in SUNW.HAStorage.

Solution:

Check /var/adm/messages to see which property name is used. Correct it according to the definition in SUNW.HAStorage.


511917 clcomm: orbdata: unable to add to hash table

Description:

The system records object invocation counts in a hash table. The system failed to enter a new hash table entry for a new object type.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


512422 WARNING: unknown msg (type %d) was picked up by a lkcm_act, returning LKCM_NOOP

Description:

Warning for unknown message picked up during udlm state update.

Solution:

None.


513538 scvxvmlg error - mkdirp(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


514047 select: %s

Description:

Solution:


514688 Invalid port number %s in the %s property.

Description:

The specified system property does not have a valid port number.

Solution:

Using scrgadm(1M), specify a valid, positive port number.
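
As a rough illustration of what a valid, positive port number means, the following Python sketch checks a candidate value before it is supplied to scrgadm(1M). The function name is_valid_port and the 1-65535 range are assumptions made for this example; the data service's own validation may apply different or additional rules.

```python
def is_valid_port(value: str) -> bool:
    """Return True if value is a decimal integer in the range 1-65535.

    Hypothetical helper for illustration only; it is not part of Sun Cluster.
    """
    text = value.strip()
    # Reject empty strings, signs, and anything that is not plain ASCII digits.
    if not (text.isascii() and text.isdigit()):
        return False
    return 1 <= int(text) <= 65535
```

A value such as "8080" passes this check, while "0", "-1", and "65536" do not.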


514731 Failed to kill listener process for %s

Description:

Failed to kill listener processes.

Solution:

None.


515583 %s is not a valid IP address.

Description:

The validation method failed to validate the IP addresses. The given IP address could not be mapped in the local host files because the specified IP address is invalid.

Solution:

Invalid hostnames/ip addresses have been specified while creating the resource. Recreate the resource with valid hostnames.
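
Before recreating the resource, it can help to confirm that an address is at least syntactically valid. The sketch below uses Python's standard ipaddress module. The function name is_ip_literal is an assumption for this example, and the check covers only the literal syntax; it does not verify the host file mapping that the validation method performs.

```python
import ipaddress


def is_ip_literal(text: str) -> bool:
    """Return True if text parses as an IPv4 or IPv6 address literal.

    Hypothetical helper for illustration; it does not consult any hosts file.
    """
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True
```

For example, "192.168.1.10" is accepted, while "256.1.1.1" is rejected because an octet is out of range.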


516407 reservation warning(%s) - MHIOCGRP_INKEYS error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


517009 lkcm_act: invalid handle was passed %s %d

Description:

Handle for communication with udlmctl is invalid.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


517036 connection from outside the cluster - rejected

Description:

There was a connection from an IP address which does not belong to the cluster. This is not allowed so the PNM daemon rejected the connection.

Solution:

This message is informational; no user action is needed. However, it is a good idea to find out who is trying to talk to the PNM daemon and why.


517343 clexecd: Error %d from pipe

Description:

clexecd program has encountered a failed pipe(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


517363 clconf: Unrecognized property type

Description:

An unrecognized property type was found in the configuration file.

Solution:

Check the configuration file.


518018 CMM: Node being aborted from the cluster.

Description:

This node is being excluded from the cluster.

Solution:

Node should be rebooted if required. Resolve the problem according to other messages preceding this message.


518291 Warning: Failed to check if scalable service group %s exists: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


519262 Validation failed. SYBASE monitor server startup file RUN_%s not found SYBASE=%s.

Description:

Monitor server was specified in the extension property Monitor_Server_Name. However, monitor server startup file was not found. Monitor server startup file is expected to be: $SYBASE/$SYBASE_ASE/install/RUN_<Monitor_Server_Name>

Solution:

Check the monitor server name specified in the Monitor_Server_Name property. Verify that the SYBASE and SYBASE_ASE environment variables are set properly in the Environment_file. Verify that the RUN_<Monitor_Server_Name> file exists.
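
The expected location of the startup file can be reconstructed from the pattern given in the description. The following Python sketch builds that path and checks whether the file exists. The function names and the sample values in the usage note are assumptions for illustration, not values taken from a real installation.

```python
import os
import posixpath


def monitor_server_run_file(sybase: str, sybase_ase: str, monitor_server_name: str) -> str:
    """Build $SYBASE/$SYBASE_ASE/install/RUN_<Monitor_Server_Name>."""
    return posixpath.join(sybase, sybase_ase, "install", "RUN_" + monitor_server_name)


def run_file_exists(sybase: str, sybase_ase: str, monitor_server_name: str) -> bool:
    """Return True if the expected startup file is present as a regular file."""
    return os.path.isfile(monitor_server_run_file(sybase, sybase_ase, monitor_server_name))
```

With the hypothetical values SYBASE=/opt/sybase, SYBASE_ASE=ASE-12_5, and Monitor_Server_Name=mon1, the expected file is /opt/sybase/ASE-12_5/install/RUN_mon1.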


520231 Unable to set the number of threads for the FED RPC threadpool.

Description:

The rpc.fed server was unable to set the number of threads for the RPC threadpool. This happens while the server is starting up.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


520384 %s action in bv_utils Failed.

Description:

The specified action failed. There can be several reasons for the failure (wrong configuration, orbixd not running, nonexistent config files, BV processes could not be started, BV servers could not be stopped, and so on).

Solution:

Look for other syslog messages to find the exact location of the failure. If it is a BroadVision configuration error, try to run the action manually and see whether everything is OK. If everything is OK manually but the error occurs under HA, contact Sun support with the /var/adm/messages files and the BV logs.


520982 CMM: Preempting node %ld from quorum device %s failed with error %d.

Description:

This node was unable to preempt the specified node from the quorum device, indicating that the partition to which the local node belongs has been preempted and will abort. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

There may be other related messages that may indicate why the partition to which the local node belongs has been preempted. Resolve the problem and reboot the node.
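
The majority rule mentioned in the description can be sketched in a few lines of Python. This is a simplified illustration that assumes a majority means more than half of all configured votes; the function name and the vote counts in the usage note are hypothetical, and the actual CMM algorithm involves more than this arithmetic.

```python
def has_majority(partition_votes: int, total_votes: int) -> bool:
    """Return True if a partition holds more than half of the configured votes.

    Simplified illustration; not the CMM implementation.
    """
    return partition_votes >= total_votes // 2 + 1
```

In a two-node cluster with one quorum device (three votes in total, assuming one vote each), the partition that holds a node vote plus the quorum device vote has two votes and survives, while the other partition has one vote and aborts.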


521393 Backup server stopped.

Description:

The backup server was stopped by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


521538 monitor_check: set_env_vars() failed for resource <%s>, resource group <%s>

Description:

During execution of a scha_control(1HA,3HA) function, the rgmd was unable to set up environment variables for method execution, causing a MONITOR_CHECK method invocation to fail. This in turn will prevent the attempted failover of the resource group from its current master to a new master.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


521671 uri <%s> probe failed

Description:

The probing of the url set in the monitor_uri_list extension property failed. The agent probe will take action.

Solution:

None. The agent probe will take action. However, the cause of the failure should be investigated further. Examine the log file and syslog messages for additional information.


521918 Validation failed. Connect string is NULL

Description:

The 'Connect_String' extension property used for fault monitoring is null. This has the format 'username/password'.

Solution:

Check for syslog messages from other system modules. Check the resource configuration and the value of the 'Connect_string' property.
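
To make the 'username/password' format concrete, the following Python sketch splits a connect string and rejects empty or incomplete values. The function name and the choice to split at the first slash are assumptions for this example; the fault monitor's own parsing may differ.

```python
from typing import Optional, Tuple


def parse_connect_string(value: Optional[str]) -> Tuple[str, str]:
    """Split a 'username/password' string into its two parts.

    Hypothetical helper for illustration. Raises ValueError if the value is
    empty or if either part is missing.
    """
    if not value:
        raise ValueError("connect string is empty")
    # Split at the first slash only (an assumption of this sketch).
    user, separator, password = value.partition("/")
    if not separator or not user or not password:
        raise ValueError("connect string must have the form username/password")
    return user, password
```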


522480 RGM state machine returned error %d

Description:

An error has occurred on this node while attempting to execute the rgmd state machine.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


522710 Failed to parse xml: invalid reg_type [%s]

Description:

Solution:


522779 IP address (hostname) string %s in property %s, entry %d could not be resolved to an IP address.

Description:

The IP address (hostname) string within the named property in the message did not resolve to a real IP address.

Solution:

Change the IP address (hostname) string within the entry in the property to one that does resolve to a real IP address. Make sure the syntax of the entry is correct.


523302 fatal: thr_keycreate: %s (UNIX errno %d)

Description:

The rgmd failed in a call to thr_keycreate(3T). The error message indicates the reason for the failure. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


523643 INTERNAL ERROR: %s

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


523933 Although there are no other potential masters, RGM is failing resource group <%s> off of node <%d> because there are other current healthy masters.

Description:

The resource group was brought OFFLINE on the node specified, probably because of a public network failure on that node. The operation was performed despite the lack of a healthy candidate node to host the resource group, because the resource group was currently mastered by at least one other healthy node.

Solution:

No action required. If desired, examine other syslog messages on the node in question to determine the cause of the network failure.


525197 No network address resources in resource group.

Description:

Solution:


525628 CMM: Cluster has reached quorum.

Description:

Enough nodes are operational to obtain a majority quorum; the cluster is now moving into operational state.

Solution:

This is an informational message, no user action is needed.


526056 Resource <%s> of Resource Group <%s> failed pingpong check on node <%s>. The resource group will not be mastered by that node.

Description:

A scha_control(1HA,3HA) call has failed because no healthy new master could be found for the resource group. A given node is considered unhealthy for a given resource if that same resource has recently initiated a failover off of that node by a previous scha_control call. In this context, "recently" means within the past Pingpong_interval seconds, where Pingpong_interval is a user-configurable property of the resource group. The default value of Pingpong_interval is 3600 seconds. This check is performed to avoid the situation where a resource group repeatedly "ping-pongs" or moves back and forth between two or more nodes, which might occur if some external problem prevents the resource group from running successfully on *any* node.

Solution:

A properly-implemented resource monitor, upon encountering the failure of a scha_control call, should sleep for a while and restart its probes. If the resource remains unhealthy, the problem that caused the scha_control call to fail (such as the pingpong check described above) will eventually resolve, permitting a later scha_control request to succeed. Therefore, no user action is required. If the system administrator wishes to permit failovers to be attempted even at the risk of ping-pong behavior, the Pingpong_interval property of the resource group should be set to a smaller value.
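
The timing rule behind the pingpong check can be sketched as follows. This Python example is a simplified illustration and not the rgmd implementation: the function name, the use of plain timestamps in seconds, and the strict comparison at the interval boundary are assumptions.

```python
from typing import Optional

# Default cited in the description above, in seconds.
DEFAULT_PINGPONG_INTERVAL = 3600


def fails_pingpong_check(
    last_failover_off_node: Optional[float],
    now: float,
    pingpong_interval: float = DEFAULT_PINGPONG_INTERVAL,
) -> bool:
    """Return True if the node is considered unhealthy for the resource.

    last_failover_off_node is the time at which the resource last initiated
    a failover off this node, or None if it never has.
    """
    if last_failover_off_node is None:
        return False
    # "Recently" means within the past pingpong_interval seconds.
    return (now - last_failover_off_node) < pingpong_interval
```

Under these assumptions, a node that the resource failed off of 10 minutes ago is rejected with the default interval, but is accepted if Pingpong_interval has been lowered to 300 seconds.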


526403 ff_open: %s

Description:

A server (rpc.pmfd or rpc.fed) was not able to establish a link to the failfast device, which ensures that the host aborts if the server dies. The error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


526492 Service object [%s, %s, %d] removed from group '%s'

Description:

A specific service known by its unique name SAP (service access point), the three-tuple, has been deleted in the designated group.

Solution:

This is an informational message, no user action is needed.


526846 Daemon <%s> is not running.

Description:

The HA-NFS fault monitor detected that the specified daemon is no longer running.

Solution:

No action is required. The fault monitor will restart the daemon. If it does not, reboot the node.


527210 Unable to read %s: %s.


527795 clexecd: setrlimit returned %d

Description:

clexecd program has encountered a failed setrlimit() system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


528020 CCR: Remove table %s failed.

Description:

The CCR failed to remove the indicated table.

Solution:

The failure can have many causes. For some of them, no user action is required because the CCR client handles the failure. The cases that require user action depend on other messages from the CCR on the node, and include the following: If the failure occurred because the cluster lost quorum, reboot the cluster. If the root file system is full on the node, free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, replace it. If the cluster repository is corrupted, as indicated by other CCR messages, boot the offending node(s) in -x mode to restore the cluster repository from backup. The cluster repository is located at /etc/cluster/ccr/.


528499 scsblconfig not configured correctly.

Description:

The specified file has not been configured correctly, or it does not have all the required settings.

Solution:

Verify that the required variables (according to the installation instructions for this data service) are correctly configured in this file. Try to manually source this file in the Korn shell ('. scsblconfig'), and verify that the required variables are set correctly.


528566 Method <%s> on resource <%s>, resource group <%s>, is_frozen=<%d>: Method timed out.

Description:

A method execution has exceeded its configured timeout and was killed by the rgmd. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Consult resource type documentation to diagnose the cause of the method failure. Other syslog messages occurring just before this one might indicate the reason for the failure. After correcting the problem that caused the method to fail, the operator may choose to issue an scswitch(1M) command to bring resource groups onto desired primaries. Note, if the indicated value of is_frozen is 1, this might indicate an internal error in the rgmd. Please save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


529131 Method <%s> on resource <%s>: RPC connection error.

Description:

An attempted method execution failed, due to an RPC connection problem. This failure is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state; or it might cause an attempted edit of a resource group or its resources to fail.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. If the same error recurs, you might have to reboot the affected node. After the problem is corrected, the operator may choose to issue an scswitch(1M) command to bring resource groups onto desired primaries, or re-try the resource group update operation.


529191 clexecd: Sending fd to workerd returned %d. Exiting.

Description:

There was some error in setting up interprocess communication in the clexecd program.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


529407 resource group %s state on node %s change to %s

Description:

This is a notification from the rgmd that a resource group's state has changed. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


530064 reservation error(%s) - do_enfailfast() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

For the user action required by this message, see the user action for message 192619.


530492 fatal: ucmm_initialize() failed

Description:

The daemon indicated in the message tag (rgmd or ucmmd) was unable to initialize its interface to the low-level cluster membership monitor. This is a fatal error, and causes the node to be halted or rebooted to avoid data corruption. The daemon produces a core file before exiting.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the core file generated by the daemon. Contact your authorized Sun service provider for assistance in diagnosing the problem.


530603 Warning: Scalable service group for resource %s has already been created.

Description:

The scalable services group for the named resource already exists, which was not expected.

Solution:

Rebooting all nodes of the cluster will cause the scalable services group to be deleted.


530828 Failed to disconnect from host %s and port %d.

Description:

The data service fault monitor probe was trying to disconnect from the specified host/port and failed. The problem may be due to an overloaded system or other problems. If such failure is repeated, Sun Cluster will attempt to correct the situation by either doing a restart or a failover of the data service.

Solution:

If this problem is due to an overloaded system, you may consider increasing the Probe_timeout property.


531148 fatal: thr_create stack allocation failure: %s (UNIX error %d)

Description:

The rgmd was unable to create a thread stack, most likely because the system has run out of swap space. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Rebooting the node has probably cured the problem. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.


531560 No IPMP group for node %s.

Description:

No IPMP group has been specified for this node.

Solution:

If this error message has occurred during resource creation, supply valid adapter information and retry it. If this message has occurred after resource creation, remove the LogicalHostname resource and recreate it with the correct IPMP group for each node which is a potential master of the resource group.


531989 Prog <%s> step <%s>: authorization error: %s.

Description:

An attempted program execution failed, apparently due to a security violation; this error should not occur. The last portion of the message describes the error. This failure is considered a program failure.

Solution:

Correct the problem identified in the error message. If necessary, examine other syslog messages occurring at about the same time to see if the problem can be diagnosed. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


532118 All the SUNW.HAStoragePlus resources that this resource depends on are not online on the local node. Skipping the checks for the existence and permissions of the start/stop/probe commands.

Description:

This is an informational message which means that the SUNW.HAStoragePlus resource(s) that this application resource depends on is not online on the local node and therefore the validation checks related to start/stop/probe commands can not be carried out on the local node.

Solution:

None.


532454 file specified in USER_ENV parameter %s does not exist

Description:

'User_env' property was set when configuring the resource. File specified in 'User_env' property does not exist or is not readable. File should be specified with fully qualified path.

Solution:

Specify existing file with fully qualified file name when creating resource. If resource is already created, please update resource property 'User_env'.


532636 Encountered an error while conducting checks of device services availability.

Description:

The start method of a HAStoragePlus resource has detected an error while verifying the availability of a device service. It is highly likely that a DCS function call returned an error.

Solution:

Contact your authorized Sun service provider for assistance in diagnosing the problem.


532654 The -c or -u flag must be specified for the %s method.

Description:

The arguments passed to the function unexpectedly omitted the given flags.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


532854 IPMP group %s has status %s so no action will be taken.

Description:

The status of the IPMP group has become stable.

Solution:

This is an informational message, no user action is needed.


532979 orbixd is started outside HA Broadvision. Stop orbixd and other BV processes running outside HA BroadVision.

Description:

The orbix daemon is probably started outside HA BroadVision. There should not be any BV servers or daemons started outside HA BroadVision.

Solution:

Shut down the orbix daemon running outside HA BroadVision and also stop all BV servers started outside HA BroadVision. Delete the file /var/run/cluster/bv/bv_orbixd_lock_file if it exists and then restart the BV resources.


532980 clcomm: Pathend %p: deferred task not allowed in state %d

Description:

The system maintains state information about a path. A deferred task is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


533359 pmf_monitor_suspend: Error opening procfs control file <%s> for tag <%s>: %s

Description:

The rpc.pmfd server was not able to resume the monitoring of a process because the rpc.pmfd server was not able to open a procfs control file. If the system error is 'Device busy', the procfs control file is being used by another command (like truss, pstack, dbx...) and the monitoring of this process remains suspended. If this is not the case, the monitoring of this process has been aborted and can not be resumed.

Solution:

If the system error is 'Device busy', stop the command which is using the procfs control file and issue the monitoring resume command again. Otherwise investigate if the machine is running out of memory. If this is not the case, save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


534499 Failed to take the resource out of PMF control.

Description:

Sun Cluster was unable to take the resource out of PMF control. This may cause the service to keep restarting on an unhealthy node.

Solution:

Look in /var/adm/messages for the cause of failure. Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


534512 in libsecurity svc_tp_create failed for transport %s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not create a rpc handle for the network specified. The rpc error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


534826 clexecd: Error %d from start_failfast_server

Description:

clexecd program could not enable one of the mechanisms which causes the node to be shutdown to prevent data corruption, when clexecd program dies.

Solution:

To avoid data corruption, system will halt or reboot the node. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


535044 Creation of resource <%s> failed because none of the nodes on which VALIDATE would have run are currently up

Description:

In order to create a resource whose type has a registered VALIDATE method, the rgmd must be able to run VALIDATE on at least one node. However, all of the candidate nodes are down. "Candidate nodes" are either members of the resource group's Nodelist or members of the resource type's Installed_nodes list, depending on the setting of the resource's Init_nodes property.

Solution:

Boot one of the resource group's potential masters and retry the resource creation operation.


535181 Host %s is not valid.

Description:

Validation method has failed to validate the ip addresses.

Solution:

Invalid hostnames/ip addresses have been specified while creating resource. Recreate the resource with valid hostnames. Check the syslog message for the specific information.


535182 in libsecurity NETPATH=%s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish a rpc connection for the network specified, because it couldn't find any transport. This happened because either there are no available transports at all, or there are but none is a loopback. The NETPATH environment variable is shown. This error message is informational, and appears together with other messages appropriate for this situation. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


535886 Could not find a mapping for %s in %s. It is recommended that a mapping for %s be added to %s.

Description:

No mapping was found in the local hosts file for the specified ip address.

Solution:

Applications may use hostnames instead of ip addresses. It is recommended to have a mapping in the hosts file. Add an entry in the hosts file for the specified ip address.


536091 Failed to retrieve the cluster handle while querying for property %s: %s.

Description:

Access to the object named failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


537175 CMM: node %s (nodeid: %ld, incarnation

Description:

The cluster can communicate with the specified node. A node becomes reachable before it is declared up and having joined the cluster.

Solution:

This is an informational message, no user action is needed.


537352 reservation error(%s) - do_scsi3_preemptandabort() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

For the user action required by this message, see the user action for message 192619.


537380 Invalid option -%c for the validate method.

Description:

An invalid option was passed to the validate callback method.

Solution:

This is an internal error. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


537498 Invalid value was returned for resource property %s for %s.

Description:

The value returned for the named property was not valid.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


537607 Not found clexecd on node %d for %d seconds. Retrying ...

Description:

Could not find clexecd to execute the program on the indicated node. The message shows how long the search has lasted so far; the operation will be retried.

Solution:

This is an informational message, no user action is needed.


538656 Restarting some BV daemons.

Description:

This message is from the BV probe. While the data service is starting up, some BV daemons may have failed to start up. The probe will restart the daemons that did not start up.

Solution:

Make sure the DB is available. The BV daemon startup may fail if the DB is not available. If the DB is available, no user action is needed. The BV probe will take appropriate action.


539760 Error parsing URL: %s. Will shutdown the WLS using sigkill

Description:

There is an error in parsing the URL needed to do a smooth shutdown. The WLS stop method however will go ahead and kill the WLS process.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


540024 INTERNAL ERROR: %s: %s.

Description:

This is an internal error.

Solution:

Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


540274 got unexpected exception %s

Description:

An inter-node communication failed with an unknown exception.

Solution:

Examine syslog output for related error messages. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file (if any). Contact your authorized Sun service provider for assistance in diagnosing the problem.


540376 Unable to change the directory to %s: %s. Current directory is /.

Description:

The callback method failed to change the current directory. The callback methods will now be executed in "/", so any core dumps from these callbacks will be located in "/".

Solution:

No user action needed. For detailed error message, check the syslog message.


540705 The DB probe script %s Timed out while executing

Description:

Probing of the URLs set in the Server_url or the Monitor_uri_list failed. Before taking any action, the WLS probe makes sure the DB is up. The database probe script set in the extension property db_probe_script timed out while executing. The probe will not take any action until the DB is up and the DB probe succeeds.

Solution:

Make sure the DB probe (set in db_probe_script) succeeds. Once the DB is started the WLS probe will take action on the failed WLS instance.


541180 Sun udlmlib library called with unknown option: '%c'

Description:

Unknown option used while starting up Oracle unix dlm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


541206 Couldn't read deleted directory: error (%d)

Description:

The file system is unable to create temporary copies of deleted files.

Solution:

Mount the affected file system as a local file system, and ensure that there is no file system entry with name "._" at the root level of that file system. Alternatively, run fsck on the device to ensure that the file system is not corrupt.


541445 Unable to compose %s path.

Description:

Unable to construct the path to the indicated file.

Solution:

Check system log messages to determine the underlying problem.


541818 Service group '%s' created

Description:

The service group by that name is now known by the scalable services framework.

Solution:

This is an informational message, no user action is needed.


543720 Desired Primaries is %d. It should be 1.

Description:

Invalid value for Desired Primaries property.

Solution:

Invalid value is set for Desired Primaries property. The value should be 1. Reset the property value using scrgadm(1M).


544252 Method <%s> on resource <%s>: Execution failed: no such method tag.

Description:

An internal error has occurred in the rpc.fed daemon which prevents method execution. This is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state, or it might cause an attempted edit of a resource group or its resources to fail.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem. Re-try the edit operation.


544380 Failed to retrieve the resource type handle: %s.

Description:

An API operation on the resource type has failed.

Solution:

For the resource type name, check the syslog tag. For more details, check the syslog messages from other components. If the error persists, reboot the node.


544592 PCSENTRY: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


544775 libpnm system error: %s

Description:

A system error has occurred in libpnm. This could be because system resources, such as memory, are very low.

Solution:

The user of libpnm should handle these errors. However, if the message is one of the following, take the indicated action:

out of memory - Increase the swap space, install more memory, or reduce peak memory consumption. Otherwise the error is unrecoverable, and the node needs to be rebooted.

write error - Check the "write" man page for possible errors.

read error - Check the "read" man page for possible errors.

socket failed - Check the "socket" man page for possible errors.

TCP_ANONPRIVBIND failed - Check the "setsockopt" man page for possible errors.

gethostbyname failed %s - Make sure that the entries in /etc/hosts, /etc/nsswitch.conf, and /etc/netconfig are correct so that information about this host can be retrieved.

bind failed - Check the "bind" man page for possible errors.


546856 CCR: Could not find the CCR transaction manager.

Description:

The CCR data server could not find the CCR transaction manager in the cluster.

Solution:

Reboot the cluster. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


547057 thr_sigsetmask: %s

Description:

A server (rpc.pmfd or rpc.fed) was not able to establish a link with the failfast device because of a system error. The error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


547145 Initialization error. Fault Monitor username is NULL

Description:

Internal error. Environment variable SYBASE_MONITOR_USER not set before invoking fault monitor.

Solution:

Report this problem to your authorized Sun service provider.


547301 reservation error(%s) error. Unknown internal error returned from clconf_do_execution().

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error.

If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes.

If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group.

If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.
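
The per-transition guidance for this message can be condensed into a small lookup table. The Python sketch below is only an illustration: the dictionary, the function name, and the paraphrased advice strings are hypothetical, and only the transition names and the run_reserve command line come from the solution text.

```python
# Command quoted in the solution text for reacquiring access to shared devices.
RUN_RESERVE = "/usr/cluster/lib/sc/run_reserve -c node_join"

# Advice paraphrased from the solution text, keyed by transition name.
ADVICE = {
    "node_join": (
        "This node may be unable to access shared devices. "
        "Try running '%s' on all cluster nodes." % RUN_RESERVE
    ),
    "release_shared_scsi2": (
        "A joining node may be unable to access shared devices. "
        "Try running '%s' on all cluster nodes." % RUN_RESERVE
    ),
    "make_primary": (
        "A device group failed to start on this node. "
        "Use scswitch to switch or restart the device group."
    ),
    "primary_to_secondary": (
        "Shutdown or switchover of a device group failed. Retry the action."
    ),
}

DEFAULT_ADVICE = (
    "Collect /var/adm/messages from all nodes and contact your service provider."
)


def advice_for(transition):
    """Return the paraphrased advice for a transition name."""
    return ADVICE.get(transition, DEFAULT_ADVICE)


print(advice_for("make_primary"))
```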


547385 dl_bind: bad ACK header %u

Description:

An unexpected error occurred. The acknowledgment header for the bind request (to bind to the physical device) is bad. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.


548024 RGOffload resource cannot offload the resource group containing itself.

Description:

You may have configured an RGOffload resource to offload the resource group in which it is configured.

Solution:

Reconfigure the RGOffload resource so that it does not offload the resource group in which it is configured.


548162 Error (%s) when reading standard property <%s>.

Description:

Error occurred in API call scha_resource_get.

Solution:

Check syslog messages for errors logged from other system modules. If error persists, please report the problem.


548237 Validation failed. Connect string contains 'sa' password.

Description:

The 'Connect_String' extension property used for fault monitoring uses 'sa' as the account password combination. This is a security risk, because the extension properties are accessible by everyone.

Solution:

Check the resource configuration and the value of the 'Connect_String' property. Ensure that a dedicated account (with minimal privileges) is created for fault monitoring purposes.


549190 Text server successfully started.

Description:

The Sybase text server has been successfully started by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


549709 Local node isn't in replica nodelist of service <%s> with path <%s>. No affinity switchover can be done

Description:

Local node does not support the replica of the service.

Solution:

No user action required.


549765 Preparing to start service %s.

Description:

Sun Cluster is preparing to start the specified application.

Solution:

This is an informational message, no user action is needed.


549969 Error doing stat on device special file <%s> corresponding to path <%s>

Description:

The file system mount point cannot be mapped to the global service correctly because stat fails on the device special file that corresponds to the file system mount point.

Solution:

Check the mount point paths defined in the "ServicePaths" extension property of the SUNW.HAStorage type resource, and make sure that they refer to global file systems with correct entries in /etc/vfstab.
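
One way to check the /etc/vfstab entries is to parse the file and look up each mount point. The Python sketch below assumes the standard seven whitespace-separated vfstab fields and assumes that a global file system carries the "global" mount option. The field names, function names, and sample entries are hypothetical.

```python
# Names for the seven vfstab columns (labels chosen for this sketch).
FIELDS = (
    "device", "fsck_device", "mount_point", "fstype",
    "fsck_pass", "mount_at_boot", "options",
)


def parse_vfstab(text):
    """Parse vfstab text into a list of dicts, skipping blanks and comments."""
    entries = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != len(FIELDS):
            raise ValueError("line %d: expected %d fields, found %d"
                             % (number, len(FIELDS), len(parts)))
        entries.append(dict(zip(FIELDS, parts)))
    return entries


def is_global_mount(entries, mount_point):
    """Return True if mount_point has an entry with the 'global' option."""
    for entry in entries:
        if entry["mount_point"] == mount_point:
            return "global" in entry["options"].split(",")
    return False


# Hypothetical sample content, not taken from a real system.
SAMPLE = """\
# device to mount  device to fsck  mount point  FS type  pass  boot  options
/dev/md/ora/dsk/d1 /dev/md/ora/rdsk/d1 /global/oracle ufs 2 yes global,logging
/dev/dsk/c0t0d0s7  /dev/rdsk/c0t0d0s7  /export/home   ufs 2 yes -
"""

entries = parse_vfstab(SAMPLE)
for path in ("/global/oracle", "/export/home"):
    print(path, is_global_mount(entries, path))
```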


550471 Failed to initialize the cluster handle: %s.

Description:

An API operation has failed while retrieving the cluster information.

Solution:

This may be solved by rebooting the node. For more details about API failure, check the messages from other components.


551094 reservation warning(%s) - Unable to open device %s, will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


551139 Failed to initialize scalable services group: Error %d.

Description:

The data service in scalable mode was unable to register itself with the cluster networking.

Solution:

There may be prior messages in syslog indicating specific problems. Reboot the node if unable to correct the situation.


551436 libsecurity: clnt_authenticate failed

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection, because it failed the authentication process. The pmfadm or scha command exits with error. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


551654 Port (%d) determined from Monitor_Uri_List is invalid

Description:

The indicated port is not valid.

Solution:

Correct the port specified in Monitor_Uri_List.
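
Before changing the property, each URI can be checked for a usable port. The Python sketch below uses the standard urllib.parse module. It assumes that a valid port lies in the range 1 to 65535 and that a URI without an explicit port falls back to a default of 80. The function name and the sample URIs are hypothetical, and the validation that the data service actually performs may differ.

```python
from urllib.parse import urlsplit


def port_of(uri, default=80):
    """Return the TCP port for uri, or None if the port is not usable."""
    try:
        port = urlsplit(uri).port
    except ValueError:
        # Raised for ports that are non-numeric or out of range.
        return None
    if port is None:
        return default
    return port if 1 <= port <= 65535 else None


# Hypothetical URIs for illustration.
for uri in ("http://lh1:8080/index.html", "http://lh1/", "http://lh1:70000/"):
    print(uri, port_of(uri))
```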


553376 CMM: Open failed with error '(%s)' and errno = %d for quorum device '%s'\n. Unable to scrub device.

Description:

The open operation failed for the specified quorum device while it was being added into the cluster. The attempt to add this quorum device will fail.

Solution:

The quorum device has failed or the path to this device may be broken. Refer to the disk repair section of the administration guide for resolving this problem. Retry adding the quorum device after the problem has been resolved.


555844 Cannot send reply: invalid socket

Description:

Solution:


556466 clexecd: dup2 of stdout returned with errno %d while exec'ing (%s). Exiting.

Description:

clexecd program has encountered a failed dup2(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


556694 Nodes %u and %u have incompatible versions and will not communicate properly.

Description:

This is an informational message from the cluster version manager and may help diagnose which systems have incompatible versions with each other during a rolling upgrade. This error may also be due to attempting to boot a cluster node in 64-bit address mode when other nodes are booted in 32-bit address mode, or vice versa.

Solution:

This message is informational; no user action is needed. However, one or more nodes may shut down in order to preserve system integrity. Verify that any recent software installations completed without errors and that the installed packages or patches are compatible with the software installed on the other cluster nodes.


556945 No permission for owner to execute %s.

Description:

The specified path does not have the correct permissions as expected by a program.

Solution:

Set the permissions for the file so that it is readable and executable by the owner.
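
The permission requirement can be verified before and after the change. The Python sketch below tests the owner read and owner execute bits by using the standard os and stat modules. The function names are hypothetical, and the sketch checks only the mode bits, not who owns the file.

```python
import os
import stat


def owner_can_read_and_execute(mode):
    """Return True if both the owner-read and owner-execute bits are set."""
    needed = stat.S_IRUSR | stat.S_IXUSR
    return mode & needed == needed


def check_path(path):
    """Apply the check to the permission bits of an existing file."""
    return owner_can_read_and_execute(os.stat(path).st_mode)


# Try a few octal modes directly.
for mode in (0o500, 0o600, 0o755):
    print(oct(mode), owner_can_read_and_execute(mode))
```

A file that fails the check can be corrected with a command such as chmod u+rx on the path.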


557585 No filesystem mounted on %s.

Description:

There is no filesystem mounted on the mount point.

Solution:

None. This is a debug message.


558350 Validation failed. Connect string is incomplete.

Description:

The 'Connect_String' extension property used for fault monitoring has been incorrectly specified. This has the format 'username/password'.

Solution:

Check the resource configuration and the value of the 'Connect_String' property. Ensure that there are no spaces in the 'Connect_String' specification.
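
The format rule can be tested before the property is set. The Python sketch below checks a candidate value against the 'username/password' form and rejects whitespace. It is an illustration only: the function name and sample values are hypothetical, and the check that the data service itself performs may be stricter.

```python
def check_connect_string(value):
    """Return a list of problems found in a 'username/password' string."""
    problems = []
    if any(ch.isspace() for ch in value):
        problems.append("contains whitespace")
    # Split at the first slash: text before is the user, after is the password.
    user, slash, password = value.partition("/")
    if not slash or not user or not password:
        problems.append("not in username/password form")
    return problems


# Hypothetical values for illustration.
for candidate in ("monitor/secret", "monitor", "monitor /secret"):
    print(repr(candidate), check_connect_string(candidate))
```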


558763 Failed to start the WebLogic server configuredin resource %s

Description:

The WebLogic Server configured in the resource could not be started by the agent.

Solution:

Try to start the WebLogic Server manually. Make sure the logical host is up before starting the WLS manually. If the server fails to start manually, check your configuration and correct it until the manual start succeeds, because the agent cannot start a server that cannot be started manually. Make sure the extension properties are correct. Make sure the port is not already in use.


558742 Resource group %s is online on more than one node.

Description:

The named resource group should be online on only one node, but it is actually online on more than one node.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


558777 Enabling failfast on all shared disk devices.

Description:

A reservation failfast will be set so nodes which share these disk groups will be brought down if they are fenced off by other nodes.

Solution:

None.


559206 Either extension property <stop_signal> is not defined, or an error occured while retrieving this property; using the default value of SIGTERM.

Description:

The stop_signal property may not be defined in the RTR file. Processing continues with the default value of SIGTERM.

Solution:

This is an informational message, no user action is needed.


559550 Error in opening /etc/vfstab: %s

Description:

Failed to open /etc/vfstab. The error message follows.

Solution:

Check with the system administrator and make sure that /etc/vfstab is properly defined.


559614 Resource <%s> of Resource Group <%s> failed monitor check on node <%s>\n

Description:

Message logged for failed scha_control monitor check methods on specific node.

Solution:

No user action required.


559857 Starting liveCache with command %s failed. Return code is %d.

Description:

Starting liveCache failed.

Solution:

Check SAP liveCache log files and also look for syslog error messages on the same node for potential errors.


560047 UNIX DLM version (%d) and SUN Unix DLM library version (%d): compatible.

Description:

The Unix DLM is compatible with the installed version of libudlm.

Solution:

None.


560781 Tag %s: could not allocate history.

Description:

The rpc.pmfd server was not able to allocate memory for the history of the tag shown, probably due to low memory. The process associated with the tag is stopped and pmfadm returns error.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


561862 PNM daemon config error: %s

Description:

A configuration error has occurred in the PNM daemon. This could be because of an incorrect configuration or file format.

Solution:

If the message is one of the following, take the indicated action:

IPMP group %s not found - Either an IPMP group name has been changed or all the adapters in the IPMP group have been unplumbed. There would have been an earlier NOTICE which said that a particular IPMP group has been removed. The pnmd has to be restarted. Send a KILL (9) signal to the PNM daemon. Because pnmd is under PMF control, it will be restarted automatically. If the problem persists, restart the node with scswitch -S and shutdown(1M).

IPMP group %s already exists - The user of libpnm (scrgadm) is trying to auto-create an IPMP group with a group name that is already being used. Typically, this should not happen, so contact your authorized Sun service provider to determine whether a workaround, patch, or suggestion is available. Make a note of all the IPMP group names on your cluster.

wrong format for /etc/hostname.%s - The format of the /etc/hostname.<adp> file is wrong. Either it has the keyword group but no group name following it, or the file has multiple lines. Correct the format of the file after going through the IPMP Admin Guide. The pnmd has to be restarted. Send a KILL (9) signal to the PNM daemon. Because pnmd is under PMF control, it will be restarted automatically. If the problem persists, restart the node with scswitch -S and shutdown(1M). Multi-line /etc/hostname.<adp> files are not supported in auto-create because it becomes difficult to figure out which IP address the user wanted to use as the non-failover IP address.
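
The two format problems named for /etc/hostname.<adp> (the keyword group without a group name, and a file with multiple lines) can be detected with a short check. The Python sketch below works on the file contents as a string. The function name and the sample contents are hypothetical, and the sketch does not validate any other part of the file.

```python
def check_hostname_file(text):
    """Return a description of the format problem, or None if none is found."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) > 1:
        return "file has %d lines; expected a single line" % len(lines)
    tokens = lines[0].split() if lines else []
    if "group" in tokens:
        position = tokens.index("group")
        if position + 1 >= len(tokens):
            return "keyword 'group' has no group name after it"
    return None


# Hypothetical file contents for illustration.
samples = {
    "good": "host1 netmask + broadcast + group sc_ipmp0 up\n",
    "missing name": "host1 netmask + broadcast + group\n",
    "multi-line": "host1 group sc_ipmp0 up\naddif host1-test deprecated up\n",
}
for label, content in samples.items():
    print(label, "->", check_hostname_file(content))
```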


562200 Application failed to stay up.

Description:

Solution:


562397 Failfast: %s.

Description:

A failfast client has encountered a deferred panic timeout and is going to panic the node. This may happen if a critical userland process, as identified by the message, dies unexpectedly.

Solution:

Check for core files of the process after rebooting the node and report these to your authorized Sun service provider.


563260 %s: Cannot create server handle.

Description:

The daemon is unable to provide RPC service due to unsuccessful call to get a service handle.


563288 Validation failed. The rac_framework resource %s, specified in the RESOURCE_DEPENDENCIES property, does not belong to the group specified in the resource group property RG_AFFINITIES

Description:

The resource being created or modified should be dependent upon the rac_framework resource in the RAC framework resource group.

Solution:

If not already created, create the RAC framework resource group and its associated resources. Then specify the rac_framework resource for this resource's RESOURCE_DEPENDENCIES property.


563343 resource type %s updated.

Description:

This is a notification from the rgmd that the operator has edited a property of a resource type. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


563800 Failed to get all IPMP groups (request failed with %d).

Description:

An unexpected error occurred while trying to communicate with the network monitoring daemon (pnmd).

Solution:

Make sure the network monitoring daemon (pnmd) is running.


563847 INTERNAL ERROR: POSTNET_STOP method is not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


563976 Unable to get socket flags: %s.

Description:

Failed to get status flags for the socket used in communicating with the application.

Solution:

This is an internal error. No user action is required other than contacting your authorized Sun service provider.


564771 Error in reading /etc/vfstab: getvfsent() returns <%d>

Description:

Error in reading /etc/vfstab. The return code of getvfsent() follows.

Solution:

Check with the system administrator and make sure that /etc/vfstab is properly defined.


564883 The Data base probe %s also failed. Will restart or failover the WLS only if the DB is UP

Description:

Probing of the URLs set in the Server_url or the Monitor_uri_list property failed. Before taking any action, the WLS probe makes sure that the DB is up (if a db_probe_script extension property is set). However, the DB probe also failed. The probe will not take any action until the DB is up and the DB probe succeeds.

Solution:

Start the database and make sure that the DB probe (the script set in db_probe_script) returns 0 for success. Once the DB is started, the WLS probe will take action on the failed WLS instance.


565159 "pmfadm -s": Error signaling <%s>: %s

Description:

An error occurred while rpc.pmfd attempted to send a signal to one of the processes of the given tag. The reason for the failure is also given. The signal was sent as a result of a 'pmfadm -s' command.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


565198 did subpath %s created for instance %d.

Description:

Informational message from scdidadm.

Solution:

No user action required.


565438 svc_run returned

Description:

The rpc.pmfd server was not able to run, due to an rpc error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


565884 tag %s: command file %s is not executable

Description:

The rpc.fed server checked the command indicated by the tag, and this check failed because the command is not executable. An error message is output to syslog.

Solution:

Check the permission mode of the command and make sure that it is executable.


565978 Home dir is not set for user %s.

Description:

Home directory for the specified user is not set in the system.

Solution:

Check and make sure the home directory is set up correctly for the specified user.


566781 ORACLE_HOME %s does not exist

Description:

Directory specified as ORACLE_HOME does not exist. ORACLE_HOME property is specified when creating Oracle_server and Oracle_listener resources.

Solution:

Specify correct ORACLE_HOME when creating resource. If resource is already created, please update resource property 'ORACLE_HOME'.


567374 Failed to stop %s.

Description:

Sun Cluster failed to stop the application.

Solution:

Use the process monitor facility, pmfadm(1M), with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the application in this resource. The tag is easy to identify because it ends in the string ".svc" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the application. This problem may occur when the cluster is under load and Sun Cluster cannot stop the application within the timeout period specified. Consider increasing the Stop_timeout property. If the error persists, reboot the node.
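
The tag selection rule (the tag ends in ".svc" and contains both the resource group name and the resource name) can be written as a simple filter over the list of tags. In the Python sketch below, the function name, the sample tags, and the exact layout of those tags are hypothetical. Only the matching rule comes from the solution text.

```python
def find_service_tags(tags, resource_group, resource):
    """Return the tags that end in '.svc' and mention both names."""
    return [
        tag for tag in tags
        if tag.endswith(".svc") and resource_group in tag and resource in tag
    ]


# Hypothetical tag list, standing in for the tags reported by pmfadm -L.
tags = [
    "oracle-rg,oracle-rs,0.svc",
    "oracle-rg,oracle-rs,0.mon",
    "web-rg,web-rs,0.svc",
]

print(find_service_tags(tags, "oracle-rg", "oracle-rs"))
```

Because the match is by substring, resource names that are prefixes of one another can produce more than one candidate, so the result should be reviewed before a tag is passed to pmfadm with the -s option.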


567610 PARAMTER_FILE %s does not exist

Description:

Oracle parameter file (typically init<sid>.ora) specified in property 'Parameter_file' does not exist or is not readable.

Solution:

Please make sure that 'Parameter_file' property is set to the existing Oracle parameter file. Reissue command to create/update the resource using correct 'Parameter_file'.


567783 %s - %s

Description:

The first %s refers to the calling program, whereas the second %s represents the output produced by that program. Typically, these messages are produced by programs such as strmqm, endmqm, and runmqtrm.

Solution:

None, if the command was successful. Otherwise, examine the other syslog messages occurring at the same time on the same node, to identify the cause of the problem.


567819 clcomm: Fixed size resource_pool short server threads: pool %d for client %d total %d

Description:

The system can create a fixed number of server threads dedicated for a specific purpose. The system expects to be able to create this fixed number of threads. The system could fail under certain scenarios without the specified number of threads. The server node creates these server threads when another node joins the cluster. The system cannot create a thread when there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage. Application memory usage could be a factor, if the error occurs when a node joins an operational cluster and not during cluster startup.


568162 Unable to create failfast thread

Description:

A server (rpc.pmfd or rpc.fed) was not able to start because it was not able to create the failfast thread, which ensures that the host aborts if the server dies. An error message is output to syslog.

Solution:

Investigate if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


568314 Failed to remove node %d from scalable service group %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


569559 Start of %s completed successfully.

Description:

The start command of the application completed successfully.

Solution:

No action required.


570394 reservation warning(%s) - USCSI_RESET failed for device %s, returned %d, will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


570802 fatal: Got error <%d> trying to read CCR when disabling monitor of resource <%s>; aborting node

Description:

Rgmd failed to read updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


571642 ucm_callback for cmmreturn generated exception %d

Description:

ucmm callback for step cmmreturn failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


571734 Validation failed. ORACLE_SID is not set

Description:

ORACLE_SID property for the resource is not set. HA-Oracle will not be able to manage Oracle server if ORACLE_SID is incorrect.

Solution:

Specify correct ORACLE_SID when creating resource. If resource is already created, please update resource property 'ORACLE_SID'.


571825 Stopping listener %s.

Description:

Informational message. HA-Oracle will be stopping Oracle listener.

Solution:

None


571950 Fault monitor detected error %s: %ld Action=%s : %s

Description:

Fault monitor has detected an error. Error detected by fault monitor and action taken by fault monitor is indicated in message.

Solution:

None


572885 Pathprefix %s for resource group %s is not readable: %s.


572955 host %s: client is null

Description:

The rgm is not able to obtain an rpc client handle to connect to the rpc.fed server on the named host. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


574542 clexecd: fork1 returned %d. Exiting.

Description:

clexecd program has encountered a failed fork1(2) system call. The error message indicates the error number for the failure.

Solution:

If the error number is 12 (ENOMEM), install more memory, increase swap space, or reduce peak memory consumption. If error number is something else, contact your authorized Sun service provider to determine whether a workaround or patch is available.
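
The error number in the message can be translated into its symbolic name before an action is chosen. The Python sketch below uses the standard errno and os modules. The function name and the wording of the returned advice are hypothetical, and the numeric values come from the system on which the script runs, so it is best run on the affected platform.

```python
import errno
import os


def advise(error_number):
    """Describe an errno value and suggest the action from the solution text."""
    name = errno.errorcode.get(error_number, "unknown")
    if error_number == errno.ENOMEM:
        action = ("install more memory, increase swap space, "
                  "or reduce peak memory consumption")
    else:
        action = "contact your authorized Sun service provider"
    return "%d (%s: %s): %s" % (
        error_number, name, os.strerror(error_number), action)


print(advise(errno.ENOMEM))
print(advise(errno.EAGAIN))
```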


574675 nodeid of ctxp is bad: %d

Description:

nodeid in the context pointer is bad.

Solution:

None. udlm takes appropriate action.


575351 Can't retrieve binding entries from node %d for GIF node %d

Description:

Failed to maintain client affinity for some sticky services running on the named server node. Connections from existing clients for those services might go to a different server node as a result.

Solution:

If client affinity is a requirement for some of the sticky services, say due to data integrity reasons, these services must be brought offline on the named node, or the node itself should be restarted.


575545 fatal: rgm_chg_freeze: INTERNAL ERROR: invalid value of rgl_is_frozen <%d> for resource group <%s>

Description:

The in-memory state of the rgmd has been corrupted due to an internal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


575853 libsecurity: create of rpc handle to program %ld failed, will keep trying

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection. The maximum time allowed for connecting (1 hr) has not been reached yet, and the pmfadm or scha command will retry to connect. An accompanying error message shows the rpc error data. The program number is shown. To find out what program corresponds to this number, use the rpcinfo command. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


575875 CMM: Resetting bus for quorum device %s failed with error %d.

Description:

When a node connected to a quorum device goes down, the surviving node tries to reset the device's bus. That reset operation for the specified quorum device failed with the indicated error.

Solution:

Check to see if the disk identified above is accessible from the node the message was seen on. If it is accessible, then contact your authorized Sun service provider to determine whether a workaround or patch is available.


576196 clcomm: error loading kernel module: %d

Description:

The loading of the cl_comm module failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


576621 Command %s does not have execute permission set.

Description:

The specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start, refers to a file that does not have execute permission set. This could be the result of 1) mis-configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.


576744 INTERNAL ERROR: Invalid resource property type <%d> on resource <%s>

Description:

An attempted creation or update of a resource has failed because of invalid resource type data. This may indicate CCR data corruption or an internal logic error in the rgmd.

Solution:

Use scrgadm(1M) -pvv to examine resource properties. If the resource or resource type properties appear to be corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Re-try the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


577088 Waiting for WLS to startup

Description:

This is an informational message that the WLS is still starting up.

Solution:

None


577140 clcomm: Exception during unmarshal_receive

Description:

The server encountered an exception while unmarshalling the arguments for a remote invocation. The system prints the exception causing this error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


579190 INTERNAL ERROR: resource group <%s> state <%s> node <%s> contains resource <%s> in state <%s>

Description:

The rgmd has discovered that the indicated resource group's state information appears to be incorrect. This may prevent any administrative actions from being performed on the resource group.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


579235 Method <%s> on resource <%s> terminated abnormally

Description:

A resource method terminated without using an exit(2) call. The rgmd treats this as a method failure.

Solution:

Consult resource type documentation, or contact the resource type developer for further information.


579531 dfstab file %s is empty.


579987 Error binding '%s' in the name server. Exiting.

Description:

clexecd program was unable to start because of some problems in the low-level clustering software.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


580124 All device services are verified to be available.

Description:

All device services specified directly or indirectly through the GlobalDevicePath and FilesystemMountPoint extension properties, respectively, are found to be available, that is, up and running.

Solution:


580163 reservation warning(%s) - MHIOCTKOWN error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


580416 Cannot restart monitor: Monitor is not enabled.

Description:

An update operation on the resource would have restarted the fault monitor. However, the monitor is currently disabled for the resource.

Solution:

This is an informational message. Check whether the monitor is disabled for the resource. If it is not, treat this as an internal error and contact your authorized Sun service provider.


581180 launch_validate: call to rpc.fed failed for resource <%s>, method <%s>

Description:

The rgmd failed in an attempt to execute a VALIDATE method, due to a failure to communicate with the rpc.fed daemon. If the rpc.fed process died, this might lead to a subsequent reboot of the node. Otherwise, this will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Re-try the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


581376 clcomm: solaris xdoor: too much reply data

Description:

The reply from a user level server will not fit in the available space.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


581413 Daemon %s is not running.

Description:

The HA-NFS fault monitor checks the health of the statd, lockd, mountd and nfsd daemons on the node. It detected that one of these daemons is not currently running.

Solution:

No action is required. The fault monitor restarts the daemon.


581898 Application failed to stay up. Start method Failure.

Description:

The application being started under pmf has exited. Either the user has decided to stop monitoring this process, or the process exceeded the number of retries. An error message is output to syslog.

Solution:

Check syslog messages and correct the problems specified in prior syslog messages. If the error still persists, please report this problem.


581902 (%s) invalid timeout '%d'

Description:

Invalid timeout value for a method.

Solution:

Make sure udlm.conf file has correct timeouts for methods.


582353 Adaptive server shutdown with wait failed. STOP_FILE %s.

Description:

The Sybase adaptive server failed to shutdown with the wait option using the file specified in the STOP_FILE property.

Solution:

This is an informational message, no user action is needed.


582418 Validation failed. SYBASE ASE startserver file not found SYBASE=%s.

Description:

The Sybase Adaptive server is started by execution of the 'startserver' file. This file is missing. The SYBASE directory is specified as a part of this error message.

Solution:

Verify the Sybase installation, including the existence and proper permissions of the 'startserver' file in the $SYBASE/$SYBASE_ASE/install directory.


582651 tag %s: does not belong to caller

Description:

The user sent a suspend/resume command to the rpc.fed server for a tag that was started by a different user. An error message is output to syslog.

Solution:

Check the tag name.


582757 No PDT Fastpath thread.

Description:

The system has run out of the resources that are required to create a thread. The system could not create the Fastpath thread that is required for cluster networking.

Solution:

If cluster networking is required, add more resources (most probably, memory) and reboot.


583138 dfstab not readable

Description:

HA-NFS fault monitor failed to read dfstab when it detected that dfstab has been modified.

Solution:

Make sure the dfstab file exists and has read permission set appropriately. Look at the prior syslog messages for any specific problems and correct them.


583224 Rebooting this node because daemon %s is not running.

Description:

The rpcbind daemon on this node is not running.

Solution:

No action. Fault monitor would reboot the node. Also see message id 804791.


583542 clcomm: Pathend: would abort node because %s for %u ms

Description:

The system would have aborted the node for the specified reason if the check for send thread running was enabled.

Solution:

No user action is required.


583563 fatal: rgm_run_state: internal error: bad state <%d> for resource group <%s>

Description:

An internal error has occurred. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


583970 Failed to shutdown liveCache immediately with command %s.

Description:

Stopping SAP liveCache with the specified command failed to complete.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


584207 Stopping %s.

Description:

Sun Cluster is stopping the specified application.

Solution:

This is an informational message, no user action is needed.


584386 PENDING_OFFLINE: bad resource state <%s> (%d) for resource <%s>

Description:

The rgmd state machine has discovered a resource in an unexpected state on the local node. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


585726 pthread_cond_init: %s

Description:

Solution:


586298 clcomm: unknown type of signals message %d

Description:

The system has received a signals message of unknown type.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


586344 clcomm: unable to unbind %s from name server

Description:

The name server would not unbind the specified entity.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


586689 Cannot access the %s command <%s> : <%s>

Description:

The command input to the agent builder is not accessible and executable. This may be due to the program not existing or the permissions not being set properly.

Solution:

Make sure the program in the command exists, is in the proper directory, and has read and execute permissions set appropriately.


588055 Failed to retrieve information for SAP xserver user %s.

Description:

The SAP xserver user is not found on the system.

Solution:

Make sure the user is available on the system.


589373 scha_control RESOURCE_RESTART failed with error code: %s

Description:

Fault monitor had detected problems in Oracle listener. Attempt to switchover resource to another node failed. Error returned by API call scha_control is indicated in the message. If problem persists, fault monitor will make another attempt to restart the listener.

Solution:

Check Oracle listener setup. Please make sure that Listener_name specified in the resource property is configured in listener.ora file. Check 'Host' property of listener in listener.ora file. Examine log file and syslog messages for additional information.


589719 Issuing failover request.

Description:

This is an informational message. The fault monitor is about to call the API function to request a failover. In case of failure, follow the syslog messages that appear after this message.

Solution:

No user action is needed.


589817 clcomm: nil_sendstream::send

Description:

The system attempted to use a "send" operation for a local invocation. Local invocations do not use a "send" operation.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


590263 Online check Error %s : %ld

Description:

An error was detected when checking the ONLINE status of the RDBMS. The error number is indicated in the message. This can be caused by RDBMS server problems or configuration problems.

Solution:

Check RDBMS server using vendor provided tools. If server is running properly, this can be fault monitor set-up error.


590357 %s: Unable to register RPC service.

Description:

The daemon is unable to provide RPC service because another daemon is already registered with the same service name, or because there is an error in the configuration.


590454 TCPTR: Machine with MAC address %s is using cluster private IP address %s on a network reachable from me. Path timeouts are likely.

Description:

The transport at the local node detected an arp cache entry that showed the specified MAC address for the above IP address. The IP address is in use at this cluster on the private network. However, the MAC address is a foreign MAC address. A possible cause is that this machine received an ARP request from another machine that does not belong to this cluster, but hosts the same IP address using the above MAC address on a network accessible from this machine. The transport has temporarily corrected the problem by flushing the offending arp cache entry. However, unless corrective steps are taken, TCP/IP communication over the relevant subnet of the private interconnect might break down, thus causing path downs.

Solution:

Make sure that no machine outside this cluster hosts this IP address on a network reachable from this cluster. If there are other Sun clusters sharing a public network with this cluster, please make sure that their private network adapters are not miscabled to the public network. By default all Sun clusters use the same set of IP addresses on their private networks.


590700 ALERT_LOG_FILE %s doesn't exist

Description:

The file specified in the resource property 'Alert_log_file' does not exist. HA-Oracle requires a correct Alert Log file for fault monitoring.

Solution:

Check the 'Alert_log_file' property of the resource. Specify the correct Oracle Alert Log file when creating the resource. If the resource is already created, update the resource property 'Alert_log_file'.


592233 setrlimit(RLIMIT_NOFILE): %s

Description:

The rpc.pmfd server was not able to set the limit of files open. The message contains the system error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


592285 clexecd: getrlimit returned %d

Description:

clexecd program has encountered a failed getrlimit(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


592378 Resource %s is not online anywhere.

Description:

The named resource is not online on any cluster node.

Solution:

None. This is an informational message.


592920 sigemptyset: %s

Description:

The rpc.fed server encountered an error with the sigemptyset function, and was not able to start. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


593330 Resource type name is null.

Description:

This is an internal error. While attempting to retrieve the resource information, null value was retrieved for the resource type name.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


594629 Failed to stop the fault monitor.

Description:

Process monitor facility has failed to stop the fault monitor.

Solution:

Use pmfadm(1M) with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the fault monitor of this resource. The tag is easy to identify because it ends in the string ".mon" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the fault monitor. If the error still persists, reboot the node.
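
The tag-matching step in this solution can be sketched as follows. This is a minimal illustration, not part of Sun Cluster: it assumes the tag names reported by pmfadm -L have already been collected into a list, and the function name and the sample tags are hypothetical.

```python
def find_monitor_tags(tags, resource_group, resource):
    """Return the tags that look like the fault monitor tag for a resource.

    Only the two properties stated in the solution text are relied on:
    the tag ends in ".mon" and contains both the resource group name
    and the resource name. No exact tag layout is assumed.
    """
    return [
        tag for tag in tags
        if tag.endswith(".mon")
        and resource_group in tag
        and resource in tag
    ]


# Hypothetical tag names, for illustration only.
sample_tags = [
    "nfs-rg,nfs-res,0.svc",
    "nfs-rg,nfs-res,0.mon",
    "ora-rg,ora-res,0.mon",
]
print(find_monitor_tags(sample_tags, "nfs-rg", "nfs-res"))
# ['nfs-rg,nfs-res,0.mon']
```

The matching tag is the one to pass to pmfadm with the -s option.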


594675 reservation warning(%s) - MHIOCGRP_REGISTER error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


595077 Error in hasp_check. Validation failed.

Description:

An internal error occurred in hasp_check.

Solution:

Check the errors logged in the syslog messages by hasp_check. Verify the existence of the /usr/cluster/bin/hasp_check binary. Please report this problem.


595101 t_sndudata in send_reply: %s

Description:

Call to t_sndudata() failed. The "t_sndudata" man page describes possible error codes. udlm will try to resend the message.

Solution:

None.


595686 %s is %d for %s. It should be 1.

Description:

The named property has an unexpected value.

Solution:

Change the value of the property to be 1.


595926 Stopping Adaptive server with nowait option.

Description:

The Sun Cluster HA for Sybase will retry the shutdown using the nowait option.

Solution:

This is an informational message, no user action is needed.


596447 UNIX DLM is asking for a reconfiguration to recover from a communication error.

Description:

A reconfiguration has been requested by udlm.

Solution:

None.


596604 clcomm: solookup on routing socket failed with error = %d

Description:

The system prepares IP communications across the private interconnect. A lookup operation on the routing socket failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


597171 Unexpected early exit while performing: '%s'

Description:

clexecd program got an error while executing the program indicated in the error message.

Solution:

Please check the error message. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


597239 The weight portion of %s at position %d in property %s is not a valid weight. The weight should be an integer between %d and %d.

Description:

The weight noted does not have a valid value. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Give the weight a valid value.
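
The check behind this message can be sketched as follows. This is an illustration only: the "weight@node" entry format, the function name, and the bounds in the example are assumptions, because the message itself reports the actual property name and valid range at run time.

```python
def find_invalid_weight(entries, low, high):
    """Return (position, entry) for the first entry with a bad weight.

    Positions start at 0, matching the message text. Each entry is
    assumed to look like "weight@node", so the weight is the part
    before the "@". Returns None when every weight is an integer in
    the inclusive range [low, high].
    """
    for position, entry in enumerate(entries):
        weight_text = entry.split("@", 1)[0]
        try:
            weight = int(weight_text)
        except ValueError:
            # The weight portion is not an integer at all.
            return position, entry
        if not low <= weight <= high:
            return position, entry
    return None


# Hypothetical property value and bounds.
print(find_invalid_weight(["3@1", "abc@2", "1@3"], 0, 100))
# (1, 'abc@2')
```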


597381 setrlimit before exec: %s

Description:

rpc.pmfd was unable to set the number of file descriptors before executing a process.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


598087 PCWSTOP: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


598259 scvxvmlg fatal error - ckmode received unknown mode %d

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


598540 clcomm: solaris xdoor: completed invo: door_return returned, errno = %d

Description:

An unusual but harmless event occurred. System operations continue unaffected.

Solution:

No user action is required.


598554 launch_validate_method: getlocalhostname() failed for resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd was unable to obtain the name of the local host, causing a VALIDATE method invocation to fail. This in turn will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Re-try the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


598979 tag %s: already suspended

Description:

The user sent a suspend command to the rpc.fed server for a tag that is already suspended. An error message is output to syslog.

Solution:

Check the tag name.


599371 Failed to stop the WebLogic server smoothly.Will try killing the process using sigkill

Description:

The smooth shutdown of WLS failed. However, the WLS stop method will go ahead and kill the WLS process.

Solution:

Check the WLS logs for more details. Check the /var/adm/messages and the syslogs for more details to fix the problem.


599430 Failed to retrieve the resource property %s: %s.

Description:

An API operation has failed while retrieving the resource property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components. For the resource name and property name, check the current syslog message.


599558 SIOCLIFADDIF of %s failed: %s.

Description:

The specified system operation failed.

Solution:

This is an internal error. Contact your authorized Sun service provider with the following information. 1) Saved copy of /var/adm/messages file. 2) Output of "ifconfig -a" command.

Message IDs 600000–699999


600967 Could not allocate buffer for DBMS log messages: %m

Description:

Fault monitor could not allocate memory for reading RDBMS log file. As a result of this error, fault monitor will not scan errors from log file. However it will continue fault monitoring.

Solution:

Check if system is low on memory. If problem persists, please stop and start the fault monitor.


601852 rpcbind is not responding, however /tmp/portmap.file and /tmp/rpcbind.file exist. rpcbind can be restarted with /usr/sbin/rpcbind -w. Not taking any action.


601901 Failed to retrieve the resource property %s for %s: %s.

Description:

The query for a property failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


603096 resource %s disabled.

Description:

This is a notification from the rgmd that the operator has disabled a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


603490 Only a single path to the WLS Home directoryhas to be set in Confdir_list

Description:

Only one single path to the WLS home directory has to be set in the Confdir_list property. The resource creation will fail if multiple home directories are configured in Confdir_list.

Solution:

The Confdir_list extension property takes only one single path to the WLS home directory. Set a single path and create the resource again.


604153 clcomm: Path %s errors during initiation

Description:

Communication could not be established over the path. The interconnect may have failed or the remote node may be down.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


604784 Probing SAP xserver timed out with command %s.

Description:

Probing the SAP xserver with the listed command timed out.

Solution:

Other syslog messages occurring just before this one might indicate the reason for the failure. Consider increasing the timeout value for the method that generated the error.


605102 This node can be a primary for scalable resource %s, but there is no IPMP group defined on this node. A IPMP group must be created on this node.

Description:

The node does not have an IPMP group defined.

Solution:

Any adapters on the node that are connected to the public network should be put under IPMP control by placing them in an IPMP group. See the ifconfig(1M) man page for details.


605301 lkcm_sync: invalid handle was passed %s%d

Description:

Invalid handle passed during lockstep execution.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


606203 Couldn't get the root vnode: error (%d)

Description:

The file system is corrupt or was not mounted correctly.

Solution:

Run fsck, and mount the affected file system again.


606362 The stop command <%s> failed to stop the application. Will now use SIGKILL to stop the application.

Description:

The user provided stop command cannot stop the application. Will re-attempt to stop the application by sending SIGKILL to the pmf tag.

Solution:

No action required.


606467 CMM: Initialization for quorum device %s failed with error EACCES. Will retry later.

Description:

This node is not able to access the specified quorum device because the node is still fenced off. An attempt will be made to access the quorum device again after the node's CCR has been recovered.

Solution:

This is an informational message, no user action is needed.


607054 %s not found.

Description:

Could not find the binary needed to start up udlm.

Solution:

Make sure the unix dlm package is installed properly.


607613 transition '%s' timed out for cluster, as did attempts to reconfigure.

Description:

Step transition failed. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


607678 clconf: No valid quorum_resv_key field for node %u

Description:

The quorum_resv_key field was found to be incorrect while converting the quorum configuration information into the quorum table.

Solution:

Check the quorum configuration information.


608202 scha_control: resource group <%s> was frozen on Global_resources_used within the past %d seconds; exiting

Description:

A scha_control call has failed with a SCHA_ERR_CHECKS error because the resource group has a non-null Global_resources_used property, and a global device group was failing over within the indicated recent time interval. The resource fault probe is presumed to have failed because of the temporary unavailability of the device group. A properly-written resource monitor, upon getting the SCHA_ERR_CHECKS error code from a scha_control call, should sleep for a while and restart its probes.

Solution:

No user action is required. Either the resource should become healthy again after the device group is back online, or a subsequent scha_control call should succeed in failing over the resource group to a new master.
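
The monitor behavior described for this message can be sketched as follows. This is a minimal sketch under stated assumptions, not the rgmd or any shipped monitor: request_failover and run_probe are hypothetical callables supplied by the caller, the string "SCHA_ERR_CHECKS" stands in for the real error code, and the wait interval is arbitrary.

```python
import time

SCHA_ERR_CHECKS = "SCHA_ERR_CHECKS"  # placeholder for the real error code


def handle_probe_failure(request_failover, run_probe, wait_seconds=30,
                         sleep=time.sleep):
    """React to a failed probe in the way the description suggests.

    request_failover() is a hypothetical wrapper around a scha_control
    call; it returns None on success or an error name on failure.
    run_probe() is a hypothetical probe of the resource.
    """
    error = request_failover()
    if error != SCHA_ERR_CHECKS:
        # Success, or an unrelated error that this sketch does not handle.
        return error
    # The group was frozen recently: wait, then resume probing rather
    # than treating the rejected request as fatal.
    sleep(wait_seconds)
    run_probe()
    return error
```

Passing the sleep function as a parameter keeps the sketch easy to exercise without real delays.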


608286 Stopping the text server.

Description:

The Text server is about to be brought down by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


608453 failfast disarm error: %d

Description:

Error during a failfast device disarm operation.

Solution:

None.


608876 PCRUN: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


609118 Error creating deleted directory: error (%d)

Description:

While mounting this file system, PXFS was unable to create some directories that it reserves for internal use.

Solution:

If the error is 28(ENOSPC), then mount this FS non-globally, make some space, and then mount it globally. If there is some other error, and you are unable to correct it, contact your authorized Sun service provider to determine whether a workaround or patch is available.
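
Several messages in this chapter report a bare error number. As a convenience, such a number can be turned into its symbolic name with a short script. This is a sketch that assumes it runs on a system whose errno numbering matches the cluster node; the function name is hypothetical.

```python
import errno
import os


def describe_errno(error_number):
    """Format an error number with its symbolic name and description.

    The name comes from the errno table of the system running this
    script, which is assumed to match the node that logged the message.
    """
    name = errno.errorcode.get(error_number, "unknown")
    return f"{error_number} ({name}): {os.strerror(error_number)}"


print(describe_errno(28))
```

On Solaris and Linux, error number 28 maps to ENOSPC, the case singled out in the solution above.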


610273 IPMP group %s has failed, so scalable resource %s in resource group %s may not be able to respond to client requests. A request will be issued to relocate resource %s off of this node.

Description:

The named IPMP group has failed, so the node may not be able to respond to client requests. It would be desirable to move the resource to another node that has functioning IPMP groups. A request will be issued on behalf of this resource to relocate the resource to another node.

Solution:

Check the status of the IPMP group on the node. Try to fix the adapters in the IPMP group.


612049 resource <%s> in resource group <%s> depends on disabled network address resource <%s>

Description:

An enabled application resource was found to implicitly depend on a network address resource that is disabled. This error is non-fatal but may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


612117 Failed to stop Text server.

Description:

Sun Cluster HA for Sybase failed to stop text server.

Solution:

Please examine whether any Sybase server processes are running on the server. Please manually shutdown the server.


612124 Volume configuration daemon not running.

Description:

Volume manager is not running.

Solution:

Bring up the volume manager.


612931 Unable to get device major number for %s driver: %s.

Description:

System was unable to translate the given driver name into device major number.

Solution:

Check whether the /etc/name_to_major file is corrupted. Reboot the node if problem persists.


613458 All devices services started successfully.

Description:

All device services specified directly or indirectly via the GlobalDevicePath and FilesystemMountPoint extension properties respectively are started on a given node.

Solution:


613522 clexecd: Error %d from poll. Exiting.

Description:

clexecd program has encountered a failed poll(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


613896 INTERNAL ERROR: process_resource: Resource <%s> is R_BOOTING in PENDING_OFFLINE or PENDING_DISABLED resource group

Description:

The rgmd is attempting to bring a resource group offline on a node where BOOT methods are still being run on its resources. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


613984 scha_control: request failed because the given resource group <%s> does not contain the given resource <%s>

Description:

A resource monitor (or some other program) is attempting to initiate a restart or failover on the indicated resource and group by calling scha_control(1ha),(3ha). However, the indicated resource group does not contain the indicated resource, so the request is rejected. This represents a bug in the calling program.

Solution:

The resource group may be restarted manually on the same node or switched to another node by using scswitch(1M) or the equivalent GUI command. Contact the author of the data service (or of whatever program is attempting to call scha_control) and report the error.


614706 Failed to create directory: <%s>


615120 fatal: unknown scheduling class '%s'

Description:

An internal error has occurred. The daemon indicated in the message tag (rgmd or ucmmd) will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the core file generated by the daemon. Contact your authorized Sun service provider for assistance in diagnosing the problem.


616999 did reconfiguration discovered invalid diskpath This path must be removed before a new path can be added. Please run did cleanup (-C) then re-run did reconfiguration (-r).\n

Description:

During scdidadm -r reconfiguration, a non-existent diskpath was found in the current namespace. This must be cleaned up before any new subpath can be added by scdidadm.

Solution:

Run devfsadm -C, then scdidadm -C, and then re-run scdidadm -r.


617643 Unable to fork(): %s.

Description:

Upon an IPMP failure, the system was unable to take any action, because it failed to fork another process.

Solution:

This might be the result of a lack of system resources. Check whether the system is low on memory or the process table is full, and take appropriate action. For specific error information, check the syslog message.


617917 Initialization failed. Invalid command line %s %s

Description:

Unable to process parameters passed to the call back method. This is an internal error.

Solution:

Please report this problem.


618107 Path %s initiation encountered errors, errno = %d. Remote node may be down or unreachable through this path.

Description:

Communication with another node could not be established over the path.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


618466 Unix DLM no longer running

Description:

UNIX DLM is expected to be running, but is not. This will result in a udlmstep1 failure.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


618585 clexecd: getmsg returned %d. Exiting.

Description:

clexecd program has encountered a failed getmsg(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


618637 The port number %d from entry %s in property %s was not found in config file <%s>.

Description:

All entries in the list property must have port numbers that correspond to ports configured in the configuration file. The port number from the list entry does not correspond to a port in the configuration file.

Solution:

Remove the entry or change its port number to correspond to a port in the configuration file.


618764 fe_set_env_vars() failed for Resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd was unable to set up environment variables for a method execution, causing the method invocation to fail. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


619171 Failed to retrieve information for user %s for SAP system %s.

Description:

Failed to retrieve home directory for the specified SAP user for the specified system ID.

Solution:

Check the system ID for SAP. SAPSID is case sensitive.


619184 %s: Unable to register callback function.

Description:

The daemon has encountered an error in RPC.


619213 t_alloc (recv_request) failed with error %d

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


619312 "%s" restarting too often ... sleeping %d seconds.

Description:

The tag shown, run by rpc.pmfd server, is restarting and exiting too often. This means more than once a minute. This can happen if the application is restarting, then immediately exiting for some reason, then the action is executed and returns OK (0), which causes the server to restart the application. When this happens, the rpc.pmfd server waits for up to 1 min before it restarts the application. An error message is output to syslog.

Solution:

Examine the state of the application, try to figure out why the application doesn't stay up, and yet the action returns OK.
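
The throttling described for this message can be pictured with a small sketch. The actual rpc.pmfd policy is not documented here, so this assumes a simple rule: if the previous start was less than a minute ago, wait out the remainder of that minute. The function name and parameters are hypothetical.

```python
def restart_delay(last_start, now, min_interval=60.0):
    """Return the number of seconds to wait before the next restart.

    Assumed policy: restarts closer together than min_interval seconds
    are delayed until the interval has elapsed, so the wait is never
    longer than min_interval. Times are in seconds.
    """
    elapsed = now - last_start
    return max(0.0, min_interval - elapsed)


print(restart_delay(last_start=100.0, now=110.0))  # 50.0
print(restart_delay(last_start=0.0, now=120.0))    # 0.0
```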


620204 Failed to start scalable service.

Description:

Unable to configure service for scalability.

Solution:

The start method on this node will fail. Sun Cluster resource management will attempt to start the service on some other node.


621686 CCR: Invalid checksum length %d in table %s, expected %d.

Description:

The checksum of the indicated table has a wrong size. This causes the consistency check of the indicated table to fail.

Solution:

Boot the offending node in -x mode to restore the indicated table from backup or other nodes in the cluster. The CCR tables are located at /etc/cluster/ccr/.


622387 const char *fmt

Description:

Function definition. Please ignore.

Solution:

None


623528 clcomm: Unregister of adapter state proxy failed

Description:

The system failed to unregister an adapter state proxy.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


623635 Warning: Failed to configure client affinity for group %s: %s

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


623759 svc_setschedprio: Could not lookup RT (real time) scheduling class info: %s

Description:

The server was not able to determine the scheduling mode info, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


624265 Text server terminated.

Description:

Text server processes were stopped in STOP method.

Solution:

None


624447 fatal: sigaction: %s (UNIX errno %d)

Description:

The rgmd has failed to initialize signal handlers by a call to sigaction(2). The error message indicates the reason for the failure. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


626478 stat of file %s failed: <%s>.

Description:

There was a failure to stat the specified file.

Solution:

Make sure that the file exists.


627375 in libsecurity: file %s not readable or bad content

Description:

The rpc.pmfd, rpc.fed or rgmd server was not able to read an rpcbind information cache file, or the file's contents are corrupted. The affected component should continue to function by calling rpcbind directly.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


627610 clconf: Invalid clconf_obj type

Description:

An invalid clconf_obj type has been encountered while converting a clconf_obj type to a group name. Valid objtypes are "CL_CLUSTER", "CL_NODE", "CL_ADAPTER", "CL_PORT", "CL_BLACKBOX", "CL_CABLE", "CL_QUORUM_DEVICE".

Solution:

This is an unrecoverable error, and the cluster needs to be rebooted. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


628203 in libsecurity could not find any tcp transport

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) because it could not find a tcp transport. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


628771 CCR: Can't read CCR metadata.

Description:

Reading the CCR metadata failed on this node during the CCR data server initialization.

Solution:

There may be other related messages on this node, which may help diagnose the problem. For example: If the root disk on the afflicted node has failed, then it needs to be replaced. If the cluster repository is corrupted, then boot this node in -x mode to restore the cluster repository from backup or other nodes in the cluster. The cluster repository is located at /etc/cluster/ccr/.


629154 Validation failed. Resource group property RG_AFFINITIES should specify a SCALABLE resource group containing the RAC framework resources

Description:

The resource being created or modified must belong to a group that has an affinity with the SCALABLE RAC framework resource group.

Solution:

If not already created, create the RAC framework resource group and its associated resources. Then specify the RAC framework resource group in the RG_AFFINITIES property of this resource's group.


629584 pthread_mutex_init: %s

Description:

Solution:


630462 Error trying to get logical hostname: <%s>.

Description:

There was an error while trying to get the logical hostname. The reason for the error is specified.

Solution:

Save a copy of /var/adm/messages from all nodes of the cluster and contact your Sun support representative for assistance.


630653 Failed to initialize DCS

Description:

There was a fatal error while this node was booting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


630971 No memory


631373 Fatal error; aborting the rpc.fed daemon.

Description:

The rpc.fed server experienced an unrecoverable error, and is aborting the node.

Solution:

Save the syslog messages file. Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


631408 PCSET: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


631429 huge address size %d

Description:

The size of the MAC address in the acknowledgment of the bind request exceeds the maximum size allotted. This occurred while trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


631648 Retrying to retrieve the resource group information.

Description:

An update to the cluster configuration occurred while resource group properties were being retrieved.

Solution:

Ignore the message.


632435 Error: attempting to copy larger list to smaller

Description:

Solution:


632645 Timeout retrieving result of bind to %s port %d for non-secure resource %s

Description:

An error occurred while the fault monitor attempted to probe the health of the data service.

Solution:

Wait for the fault monitor to correct this by performing a restart or failover. For a more detailed error description, look at the syslog messages.


633457 reservation fatal error(%s) - my_map_to_did_device() error in is_scsi3_disk()

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


633745 pthread_kill: %s

Description:

The rpc.fed server encountered an error with the pthread_kill function. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


634957 thr_keycreate failed in init_signal_handlers

Description:

The ucmmd failed in a call to thr_keycreate(3T). ucmmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes and of the ucmmd core. Contact your authorized Sun service provider for assistance in diagnosing the problem.


636851 INTERNAL ERROR: usage: $0 <gateway_root>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


637372 invalid IP address in hosts list: %s

Description:

Solution:


637677 (%s) t_alloc: tli error: %s

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


638868 %s does not exist or is not mounted.


639855 IPMP group %s has status %s. Assuming this node cannot respond to client requests.

Description:

The state of the named IPMP group is degraded.

Solution:

Make sure all adapters and cables are working. Look in the /var/adm/messages file for messages from the network monitoring daemon (pnmd).


640029 PENDING_ONLINE: bad resource state <%s> (%d) for resource <%s>

Description:

The rgmd state machine has discovered a resource in an unexpected state on the local node. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


640087 udlmctl: incorrect command line

Description:

udlmctl will not start up because of incorrect command line options.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


640090 CMM: Initialization for quorum device %s failed with error %d.

Description:

The initialization of the specified quorum device failed with the specified error, and this node will ignore this quorum device.

Solution:

There may be other related messages on this node which may indicate the cause of this problem. Refer to the quorum disk repair section of the administration guide for resolving this problem.


640484 clconf: No valid votecount field for quorum device %d

Description:

The votecount field for the quorum device was found to be incorrect while converting the quorum configuration information into the quorum table.

Solution:

Check the quorum configuration information.


640799 pmf_alloc_thread: ENOMEM

Description:

The rpc.pmfd server was not able to allocate a new monitor thread, probably due to low memory. As a consequence, the rpc.pmfd server was not able to monitor a process. An error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


641686 sigaction: %s

Description:

The rpc.fed server encountered an error with the sigaction function, and was not able to start. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


642678 INTERNAL ERROR: usage: $0 <logicalhost> <server_root> <siebel_enterprise> <siebel_servername>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


643472 fatal: Got error <%d> trying to read CCR when enabling resource <%s>; aborting node

Description:

Rgmd failed to read updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


643802 Resource group is online on more than one node.

Description:

An internal error has occurred. Resource group should be online on only one node.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


644140 Fault monitor is not running.

Description:

Sun Cluster tried to stop the fault monitor for this resource, but the fault monitor was not running. This is most likely because the fault monitor was unable to start.

Solution:

Look for prior syslog messages relating to the starting of the fault monitor and take corrective action. No other action is needed.


644850 File %s is not readable: %s.

Description:

Unable to open the file in read only mode.

Solution:

Make sure the specified file exists and has the correct permissions. For the file name and details, check the syslog messages.
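
The two checks named in this solution (existence and permissions) can be run by hand with a short script. The following is a minimal sketch, not part of Sun Cluster; the function name check_readable is hypothetical, and the check applies to the user who runs the script, which may differ from the user the data service runs as.

```python
import os
import stat


def check_readable(path):
    """Report whether `path` exists and is readable by the current user."""
    try:
        st = os.stat(path)
    except OSError as exc:
        # The file is missing or a path component is inaccessible.
        return f"stat failed: {exc.strerror}"
    if not os.access(path, os.R_OK):
        # The file exists, but its mode bits deny read access to this user.
        return f"not readable by current user (mode {stat.filemode(st.st_mode)})"
    return "ok"
```

Run the check as the same user that owns the failing resource, passing the file name reported in the syslog message.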


645501 %s initialization failure

Description:

Failed to initialize the hafoip or hascip callback method.

Solution:

Retry the operation. If the error persists, contact your Sun service representative.


646037 Probe timed out.

Description:

The simple probe on the network aware application timed out.

Solution:

This problem may occur when the cluster is under heavy load. You may consider increasing the Probe_timeout property.
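
Before increasing Probe_timeout, it can help to measure how the application responds to a plain connection attempt. The sketch below is an assumption about what a "simple probe" amounts to (a TCP connect bounded by a timeout); it is not the Sun Cluster probe code, and the function name tcp_probe and its parameters are hypothetical.

```python
import socket


def tcp_probe(host, port, timeout_s):
    """Return True if a TCP connection to host:port succeeds within timeout_s seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

If such a connect regularly takes close to the configured timeout while the cluster is under load, a larger Probe_timeout value is a reasonable adjustment.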


646664 Online check Error %s: %ld: %s

Description:

Error detected when checking ONLINE status of RDBMS. Error number is indicated in message. This can be because of RDBMS server problems or configuration problems.

Solution:

Check the RDBMS server using vendor-provided tools. If the server is running properly, this can be a fault monitor setup error.


646815 PCUNSET: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


646950 clcomm: Path %s being cleaned up

Description:

A communication link is being removed with another node. The interconnect may have failed or the remote node may be down.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


647339 (%s) scan of dlmmap failed on "%s", idx =%d

Description:

Failed to scan dlmmap.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


647673 scvxvmlg error - dcs_get_service_parameters() failed, returned %d

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


648339 Failed to retrieve ip addresses configured on adapter %s.

Description:

System was attempting to list all the ip addresses configured on the specified adapter, but it was unable to do that.

Solution:

Check the messages that are logged just before this message for possible causes. For more help, contact your authorized Sun service provider with the following information: the /var/adm/messages file and the output of the "ifconfig -a" command.


648814 loading transport %s failed

Description:

Topology Manager could not load the specified transport module.

Solution:

Check that the transport modules exist in the correct directories with the correct permissions.


649584 Modification of resource group <%s> failed because none of the nodes on which VALIDATE would have run for resource <%s> are currently up

Description:

Before it will permit the properties of a resource group to be edited, the rgmd runs the VALIDATE method on each resource in the group for which a VALIDATE method is registered. For each such resource, the rgmd must be able to run VALIDATE on at least one node. However, all of the candidate nodes are down. "Candidate nodes" are either members of the resource group's Nodelist or members of the resource type's Installed_nodes list, depending on the setting of the resource's Init_nodes property.

Solution:

Boot one of the resource group's potential masters and retry the resource group modification operation.


649648 svc_probe used entire timeout of %d seconds during read operation and exceeded the timeout by %d seconds. Attempting disconnect with timeout %d

Description:

The probe timed out while reading from the application.

Solution:

If the problem persists, investigate why the application is responding slowly or whether the Probe_timeout property needs to be increased.


649860 RGM isn't failing resource group <%s> off of node <%d>, because no current or potential master is healthy enough

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed on all potential masters, because no candidate node was healthy enough to host the resource group.

Solution:

Examine other syslog messages on all cluster members that occurred about the same time as this message, to see if the problem that caused the MONITOR_CHECK failure can be identified. Repair the condition that is preventing any potential master from hosting the resource.


650276 Failed to get port numbers from config file <%s>.

Description:

An error occurred while parsing the configuration file to extract port numbers.

Solution:

Check that the configuration file path exists and is accessible. Check that port keywords and values exist in the file.


650390 Validation failed. init<sid>.ora file does not exist: %s

Description:

Oracle Parameter file has not been specified. Default parameter file indicated in the message does not exist. Cannot start Oracle server.

Solution:

Make sure that the parameter file exists at the location indicated in the message, or specify the 'Parameter_file' property for the resource. Then clear the START_FAILED flag on the resource and bring the resource online.


650825 Method <%s> on resource <%s> terminated due to receipt of signal <%d>

Description:

A resource method was terminated by a signal, most likely resulting from an operator-issued kill(1). The method is considered to have failed.

Solution:

No action is required. The operator may choose to issue an scswitch(1M) command to bring resource groups onto desired primaries, or re-try the administrative action that was interrupted by the method failure.
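
The message reports the signal as a number. A short script can translate that number into a name; this is a minimal sketch in which the helper name signal_name is hypothetical. Signal numbering is platform specific, so run it on the same operating system as the cluster node, or consult signal.h on that node.

```python
import signal


def signal_name(signum):
    """Map a signal number to its symbolic name on the current platform."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        # The number does not correspond to a signal known on this platform.
        return f"unknown signal {signum}"
```

For example, signal_name(signal.SIGKILL) returns "SIGKILL", the signal most commonly sent by an operator-issued kill -KILL.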


650932 malloc failed for ipaddr string

Description:

Call to malloc failed. The "malloc" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.


651091 INTERNAL ERROR: Invalid upgrade-from tunablity flag <%d>; aborting node

Description:

A fatal internal error has occurred in the RGM.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


651093 reservation message(%s) - Fencing node %d from disk %s

Description:

The device fencing program is taking access to the specified device away from a non-cluster node.

Solution:

This is an informational message, no user action is needed.


651327 Failed to delete scalable service group %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


651865 Failed to stop liveCache gracefully with command %s. Will stop it immediately with db_stop.

Description:

Failed to stop liveCache with 'lcinit stop'. Will shutdown immediately with 'dbmcli db_stop'.

Solution:

Informative message. No user action is needed.


652399 Ignoring the SCHA_ERR_SEQID while retrieving %s

Description:

An update to the cluster configuration tables occurred while trying to retrieve certain cluster-related information. However, the update does not affect the property that is being retrieved.

Solution:

Ignore the message.


653058 Adapter not specified for node %s.


653062 Syntax error on line %s in dfstab file.

Description:

The specified share command is incorrect.

Solution:

Correct the share command using the dfstab(4) man pages.


653183 Unable to create the directory %s: %s. Current directory is /.

Description:

The callback method failed to create the specified directory. The callback methods will now be executed in "/", so any core dumps from these callbacks will be located in "/".

Solution:

No user action needed. For a detailed error message, check the syslog messages.


654520 INTERNAL ERROR: rgm_run_state: bad state <%d> for resource group <%s>

Description:

The rgmd state machine on this node has discovered that the indicated resource group's state information is corrupted. The state machine will not launch any methods on resources in this resource group. This may indicate an internal logic error in the rgmd.

Solution:

Other syslog messages occurring before or after this one might provide further evidence of the source of the problem. If not, save a copy of the /var/adm/messages files on all nodes, and (if the rgmd crashes) a copy of the rgmd core file, and contact your authorized Sun service provider for assistance.


654546 Probe_timeout is not set.

Description:

The resource property Probe_timeout is not set. This property controls the probe time interval.

Solution:

Check whether this property is set. Otherwise, set it using scrgadm(1M).


654567 Failed to retrieve SAP binary path.

Description:

Cannot retrieve the path to SAP binaries.

Solution:

This is an internal error. There may be prior messages in syslog indicating specific problems. Make sure that the system has enough memory and swap space available. Save the /var/adm/messages from all nodes. Contact your authorized Sun service provider.


655410 getsockopt: %s

Description:

Solution:


655416 setsockopt: %s

Description:

Solution:


655512 checkdaemon failed for $HOSTNAME.

Description:

This message is from the BV probe. The BV command checkdaemon on the specified host failed.

Solution:

No user action needed. The BV probe will take appropriate action.


656721 clexecd: %s: sigdelset returned %d. Exiting.

Description:

clexecd program has encountered a failed sigdelset(3C) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


656795 CMM: Unable to bind <%s> to nameserver.

Description:

An instance of the userland CMM encountered an internal initialization error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


657495 Tag %s: error number %d in throttle wait; process will not be requeued.

Description:

An internal error has occurred in the rpc.pmfd server while waiting before restarting the specified tag. rpc.pmfd will delete this tag from its tag list and discontinue retry attempts.

Solution:

If desired, restart the tag under pmf using the 'pmfadm -c' command.


657560 CMM: Reading reservations from quorum device %s failed with error %d.

Description:

The specified error was encountered while trying to read reservations on the specified quorum device.

Solution:

There may be other related messages on this and other nodes connected to this quorum device that may indicate the cause of this problem. Refer to the quorum disk repair section of the administration guide for resolving this problem.


657875 Could not reset SCSI buses on CMM reconfiguration. User program did not execute cleanly.

Description:

An error occurred when the SC 3.0 software was in the process of resetting SCSI buses with shared nodes that are down.

Solution:

Look in /var/adm/messages for other messages before this that may help to pinpoint the exact cause of the failure. If no such message is available, then contact your authorized Sun service provider to determine whether a workaround or patch is available.


657885 sigwait: %s

Description:

Solution:


658329 CMM: Waiting for initial handshake to complete.

Description:

The userland CMM has not been able to complete its initial handshake protocol with its counterparts on the other cluster nodes, and will only be able to join the cluster after this is completed.

Solution:

This is an informational message, no user action is needed.


658555 Retrying to retrieve the resource information.

Description:

An update to the cluster configuration occurred while resource properties were being retrieved.

Solution:

Ignore the message.


659665 kill -KILL: %s

Description:

The rpc.fed server is not able to stop a tag that timed out, and the error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified.


659827 CCR: Can't access CCR metadata on node %s errno = %d.

Description:

The indicated error occurred when the CCR tried to access the CCR metadata on the indicated node. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28 (ENOSPC) indicates that the root file system on the node is full. Other errno values, such as EIO, can be returned when the root disk has failed.

Solution:

There may be other related messages on the node where the failure occurred. These may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced. If the cluster repository is corrupted, boot the indicated node in -x mode to restore it from backup. The cluster repository is located at /etc/cluster/ccr/.
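
The errno value in this message can be decoded, and the free space on the root file system checked, with a few lines of script. This is a minimal sketch: the helper names describe_errno and free_fraction are hypothetical, and the numeric errno values reported are those of the platform on which the script runs, so /usr/include/sys/errno.h on the cluster node remains the authoritative reference.

```python
import errno
import os
import shutil


def describe_errno(value):
    """Return the symbolic name and system description for an errno value."""
    name = errno.errorcode.get(value, "UNKNOWN")
    return f"{name} ({value}): {os.strerror(value)}"


def free_fraction(path="/"):
    """Return the fraction of the file system holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total
```

A free_fraction("/") result close to 0.0 is consistent with the ENOSPC case described above.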


660332 launch_validate: fe_set_env_vars() failed for resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd was unable to set up environment variables for method execution, causing a VALIDATE method invocation to fail. This in turn will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Re-try the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


660368 CCR: CCR service not available, service is %s.

Description:

The CCR service is not available due to the indicated failure.

Solution:

Reboot the cluster. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


660974 file specified in USER_ENV %s does not exist

Description:

'User_env' property was set when configuring the resource. File specified in 'User_env' property does not exist or is not readable. File should be specified with fully qualified path.

Solution:

Specify existing file with fully qualified file name when creating resource. If resource is already created, please update resource property 'User_env'.


661084 liveCache was stopped by the user outside of Sun Cluster. Sun Cluster will suspend monitoring until liveCache is again started up successfully outside of Sun Cluster.

Description:

When Sun Cluster tries to bring up liveCache, it detects that liveCache was intentionally brought down by the user outside of Sun Cluster. Sun Cluster will not try to restart it under the control of Sun Cluster until liveCache is started up successfully again by the user. This behaviour is enforced across nodes in the cluster.

Solution:

Informative message. No action is needed.


661560 All the SUNW.HAStoragePlus resources that this resource depends on are online on the local node. Proceeding with the checks for the existence and permissions of the start/stop/probe commands.

Description:

This is an informational message which means that the SUNW.HAStoragePlus resource(s) that this application resource depends on is online on the local node and therefore the validation checks related to start/stop/probe commands will be carried out on the local node.

Solution:

None.


661614 Method <%s> failed on resource <%s> in resource group <%s>, exit code <%d>

Description:

A resource method exited with a non-zero exit code; this is considered a method failure. Depending on which method is being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Consult resource type documentation to diagnose the cause of the method failure. Other syslog messages occurring just before this one might indicate the reason for the failure. After correcting the problem that caused the method to fail, the operator may choose to issue an scswitch(1M) command to bring resource groups onto desired primaries.


661778 clcomm: memory low: freemem 0x%x

Description:

The system is reporting a very low level of free memory.

Solution:

If the system fails soon after this message, it is significantly more likely that the system ran out of memory; in that case, either install more memory or reduce the system load. If the system continues to function, it has recovered and no user action is required.


661782 Could not clear stale entries in the orbixd checkpoint file $ORBIXD_CHECKPOINT_FILE

Description:

This is an internal error.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


662056 Failed to shutdown lockd gracefully.


662516 SIOCGLIFNUM: %s

Description:

Solution:


663089 clexecd: %s: sigwait returned %d. Exiting.

Description:

clexecd program has encountered a failed sigwait(3C) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


663293 reservation error(%s) - do_status() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

The action which failed is a scsi-2 ioctl. These can fail if there are scsi-3 keys on the disk. To remove invalid scsi-3 keys from a device, use 'scdidadm -R' to repair the disk (see scdidadm man page for details). If there were no scsi-3 keys present on the device, then this error is indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.


663835 in libsecurity creat of file %s failed: %s

Description:

The rpc.pmfd, rpc.fed or rgmd server was not able to create a cache file for rpcbind information. The affected component should continue to function by calling rpcbind directly.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


663851 Failover %s data services must have exactly one value for extension property %s.

Description:

Failover data services must have one and only one value for Confdir_list.

Solution:

Create a failover resource group for each configuration file.


663897 clcomm: Endpoint %p: %d is not an endpoint state

Description:

The system maintains information about the state of an Endpoint. The Endpoint state is invalid.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


663943 Quorum: Unable to reset node information on quorum disk.

Description:

This node was unable to reset some information on the quorum device. This will lead the node to believe that its partition has been preempted. This is an internal error. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.
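
As a rough illustration of the majority rule described above, the following Python sketch checks which subcluster, if any, holds a strict majority of the configured votes. It is a simplified model under stated assumptions, not the CMM algorithm: the function names and the vote counts are hypothetical, and the votes of a quorum device are simply counted with the partition that holds it.

```python
def has_majority(votes_held, total_votes):
    """True if votes_held is a strict majority of total_votes."""
    return votes_held > total_votes // 2


def surviving_partition(partition_votes):
    """Return the name of the partition with majority quorum, or None.

    partition_votes maps a partition name to the votes it holds,
    including any quorum device votes it has acquired.
    """
    total = sum(partition_votes.values())
    for name, votes in partition_votes.items():
        if has_majority(votes, total):
            return name  # at most one partition can hold a strict majority
    return None


if __name__ == "__main__":
    # Two single-vote nodes split apart; partition A also holds a
    # one-vote quorum device (hypothetical numbers).
    print(surviving_partition({"A": 2, "B": 1}))
```

Because the majority must be strict, an even split such as 2 votes against 2 leaves no survivor in this model, which is why a quorum device is useful as a tie-breaker.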


664371 %s: Not in cluster mode. Exiting...

Description:

Must be in cluster mode to execute this command.

Solution:

Reboot into cluster mode and retry the command.


665015 Scalable service instance [%s,%s,%d] registered on node %s.

Description:

The specified scalable service has been registered on the specified node. Now, the gif node can redirect packets for the specified service to this node.

Solution:

This is an informational message, no user action is needed.


665195 INTERNAL ERROR: rebalance: invalid node name in Nodelist of resource group <%s>

Description:

An internal error has occurred in the rgmd. This error may prevent the rgmd from bringing the affected resource group online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


665297 Failed to validate BV configuration.

Description:

Validation of the BV extension properties or the Broadvision configuration has failed.

Solution:

Look for other error messages generated while validating the extension properties or Broadvision configuration to identify the exact error. Look for the appropriate action for that error message.


665845 Node list for resource group %s is empty.


665931 Initialization error. CONNECT_STRING is NULL

Description:

An error occurred in monitor initialization. The monitor is unable to get the resource property 'Connect_string'.

Solution:

Check the resource configuration and the value of the 'Connect_string' property. Check syslog messages for errors logged from other system modules. Stop and start the fault monitor. If the error persists, disable the fault monitor and report the problem.


666391 clcomm: invalid invocation result status %d

Description:

An invocation completed with an invalid result status.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


666443 unix DLM already running

Description:

UNIX DLM is already running. Another dlm will not be started.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


666603 clexecd: Error %d in fcntl(F_GETFD). Exiting.

Description:

clexecd program has encountered a failed fcntl(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


667020 Invalid shared path

Description:

The HA-NFS fault monitor detected that one or more shared paths in dfstab are invalid.

Solution:

Make sure all paths in dfstab are correct. Look at the prior syslog messages for any specific problems and correct them.


667429 Multiple entries in %s have the same port number: %d.

Description:

Multiple entries in the specified property have the same port number.

Solution:

Remove one of the entries or change its port number.
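
To locate the offending entries before editing the property, a check along the following lines may help. This is a minimal Python sketch, not a Sun Cluster tool. It assumes the property value is a comma-separated list of entries of the form port/protocol (for example 80/tcp); the function name duplicate_ports is hypothetical.

```python
from collections import Counter


def duplicate_ports(port_list):
    """Return the sorted port numbers that appear in more than one entry.

    Assumption: entries are comma-separated and look like "80/tcp".
    """
    ports = []
    for entry in port_list.split(","):
        entry = entry.strip()
        if entry:
            # The port number is the part before the first "/".
            ports.append(int(entry.split("/")[0]))
    counts = Counter(ports)
    return sorted(port for port, seen in counts.items() if seen > 1)


if __name__ == "__main__":
    print(duplicate_ports("80/tcp,8080/tcp,80/udp"))
```

The sketch compares port numbers only, matching the wording of the message, so two entries with the same port but different protocols are reported.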


668866 Successful shutdown; terminating daemon

Description:

Solution:


669026 fcntl(F_SETFD) failed in close_on_exec

Description:

An fcntl operation failed. The "fcntl" man page describes possible error codes.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.
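
For readers unfamiliar with the operation named in this message, the following Python sketch shows the general POSIX pattern of reading the descriptor flags with F_GETFD and writing them back with FD_CLOEXEC set through F_SETFD. It is an illustration only and is not taken from the Sun Cluster source; the function name close_on_exec mirrors the message text but its body here is an assumption.

```python
import fcntl
import os


def close_on_exec(fd):
    """Set FD_CLOEXEC on fd while preserving any other descriptor flags.

    Either fcntl call raises OSError (carrying errno) on failure.
    """
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    # Clear the flag first so the demonstration starts from a known state.
    os.set_inheritable(read_fd, True)
    try:
        close_on_exec(read_fd)
        print(bool(fcntl.fcntl(read_fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))
    except OSError as err:
        print("fcntl failed, errno", err.errno)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

A failure at either step would surface as an error number, which is the kind of value that the fcntl man page documents.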


670753 reservation fatal error(%s) - unable to determine node id for node %s

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


670799 CMM: Registering reservation key on quorum device %s failed with error %d.

Description:

The specified error was encountered while trying to place the local node's reservation key on the specified quorum device. This node will ignore this quorum device.

Solution:

There may be other related messages on this and other nodes connected to this quorum device that may indicate the cause of this problem. Refer to the quorum disk repair section of the administration guide for resolving this problem.


670839 Staring liveCache times out with command %s.

Description:

Starting liveCache timed out.

Solution:

Look for syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


671954 waitpid: %s

Description:

The rpc.pmfd or rpc.fed server was not able to wait for a process. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


672013 Waiting for orbixd to start.

Description:

Just an informational message that the method will wait until orbixd starts up.

Solution:

No action needed.


672019 Stop method failed. Error: %d.

Description:

Stop method failed, while attempting to restart the data service.

Solution:

Check the Stop_timeout and adjust it if it is not appropriate. For the detailed explanation of failure, check the syslog messages that occurred just before this message.


672372 dl_attach: bad ACK header %u

Description:

Could not attach to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


672511 Failed to start Text server.

Description:

Sun Cluster HA for Sybase failed to start the text server. Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Please check whether the server can be started manually. Examine the HA-Sybase log files, text server log files, and setup.


674359 load balancer deleted

Description:

This message indicates that the service group has been deleted.

Solution:

This is an informational message, no user action is needed.


674415 svc_restore_priority: %s

Description:

The rpc.pmfd or rpc.fed server was not able to run the application in the correct scheduling mode, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


674848 fatal: Failed to read CCR

Description:

The rgmd is unable to read the cluster configuration repository. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


675221 clcomm:Cannot fork1() after ORB server initialization.

Description:

A user level process attempted to fork1 after ORB server initialization. This is not allowed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


675432 pmf_monitor_suspend: poll: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. This error occurred for a process whose monitoring had been suspended. The monitoring of this process has been aborted and cannot be resumed.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


675776 Stopped the fault monitor.

Description:

The fault monitor for this data service was stopped successfully.

Solution:

No action needed.


676141 in libsecurity could not copy host name

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) because the host name could not be saved, probably due to low memory. An error message is output to syslog.

Solution:

Investigate if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


676558 WARNING: Global_resources_used property of resource group <%s> is set to non-null string, assuming wildcard

Description:

The Global_resources_used property of the resource group was set to a specific non-null string. The only supported settings of this property in the current release are null ("") or wildcard ("*").

Solution:

No user action is required; the rgmd will interpret this value as wildcard. This means that method timeouts for this resource group will be suspended while any device group temporarily goes offline during a switchover or failover. This is usually the desired setting, except when a resource group has no dependency on any global device service or pxfs file system.


677278 No network address resource in resource group.

Description:

A resource has no associated network address.

Solution:

For a failover data service, add a network address resource to the resource group. For a scalable data service, add a network resource to the resource group referenced by the RG_dependencies property.


677428 %s can't UP %s

Description:

This means that the Logical IP address could not be set to UP.

Solution:

There could be other related error messages which might be helpful. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


677759 Unknown status code %d.

Description:

This message indicates that an unknown status code was returned by one of the underlying subsystems and an internal error has occurred.

Solution:

Report this problem.


678041 lkcm_sync: cm_reconfigure failed: %s

Description:

ucmm reconfiguration failed.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


678319 (%s) getenv of "%s" failed.

Description:

Failed to get the value of an environment variable. udlm will fail to go through a transition.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


678755 dl_bind: DL_BIND_ACK protocol error

Description:

Could not bind to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


679912 uaddr2taddr: %s

Description:

Call to uaddr2taddr() failed. The "uaddr2taddr" man page describes possible error codes. udlm will exit and the node will abort and panic.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


680437 Start method failed. Error: %d.

Description:

Restart of the data service failed.

Solution:

Check the syslog messages that occurred just before this message to see whether there is an internal error. In case of an internal error, contact your Sun service provider. Otherwise, any of the following situations may have happened. 1) Check the Start_timeout value and adjust it if it is not appropriate. 2) Check whether the application's configuration is correct. 3) This might be the result of a lack of system resources. Check whether the system is low on memory or the process table is full and take appropriate action.


680675 clcomm: thread_create failed for monitor

Description:

The system could not create the needed thread, because there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage. Since this happens during system startup, application memory usage is normally not a factor.


680960 Unable to write data: %s.

Description:

Failed to write the data to the socket. The reason might be expiration of timeout, hung application or heavy load.

Solution:

Check if the application is hung. If this is the case, restart the application.


681547 fatal: Method <%s> on resource <%s>: Received unexpected result <%d> from rpc.fed, aborting node

Description:

A serious error has occurred in the communication between rgmd and rpc.fed. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


682887 CMM: Initialization for quorum device %s failed with error EACCES. Will issue a SCSI2 Tkown and retry.

Description:

This node is not able to access the specified quorum device because the node is still fenced off. A retry will be attempted.

Solution:

This is an informational message, no user action is needed.


683763 TCPTR: Attempt to join from remote node %u that has incompatible cluster software. \"%s\" on node %u not compatible with \"%s\" on node %u.

Description:

Transport at the local node received an initial handshake message from a remote node that is not running a compatible version of the Sun Cluster software.

Solution:

Make sure all nodes in the cluster are running compatible versions of the Sun Cluster software.


683997 Failed to retrieve the resource group property %s: %s

Description:

Unable to retrieve the resource group property.

Solution:

For the property name and the reason for failure, check the syslog message. For more details about the API failure, check the syslog messages from the RGM.


684383 Development system shut down successfully.

Description:

Informational message.

Solution:

No action needed.


684753 store_binding: <%s> bad bind type <%d>

Description:

During a name server binding store an unknown binding type was encountered.

Solution:

No action required. This is an informational message.


684895 Failed to validate scalable service configuration: Error %d.

Description:

An error was detected in the Load_balancing_weights property for the data service.

Solution:

Use the scrgadm command to change the Load_balancing_weights property to a valid value.


685886 Failed to communicate: %s.

Description:

While determining the health of the data service, the fault monitor failed to communicate with the process monitor facility.

Solution:

This is an internal error. Save the /var/adm/messages file and contact your authorized Sun service provider. For more details about the error, check the syslog messages.


686220 Node attempted to join with invalid version message

Description:

Initial handshake message from a cluster node did not have a valid format.

Solution:

Check if all cluster nodes are running the same version of the clustering software.


687457 Attempting to kill pid %d name %s resulted in error: %s.

Description:

HA-NFS callback method attempted to stop the specified NFS process with SIGKILL but was unable to do so because of the specified error.

Solution:

The failure of the method would be handled by Sun Cluster. If the failure happened during starting of an HA-NFS resource, the resource would be failed over to some other node. If this happened during stopping, the node would be rebooted and HA-NFS service would continue on some other node. If this error persists, please contact your local Sun service provider for assistance.


687543 shutdown abort did not succeed.

Description:

HA-Oracle failed to shut down the Oracle server using 'shutdown abort'.

Solution:

Examine log files and syslog messages to determine the cause of failure.


687929 daemon %s did not respond to null rpc call: %s.

Description:

HA-NFS fault monitor failed to ping an nfs daemon.

Solution:

No action required. The fault monitor will restart the daemon if necessary.


688163 clexecd: pipe returned %d. Exiting.

Description:

clexecd program has encountered a failed pipe(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


688525 $daemon_status daemons are not running on $HOSTNAME

Description:

This message is from the BV Probe. The probe detected that the specified number of daemons are not running.

Solution:

No user action needed. The BV Probe will take appropriate action.


689075 Failed to delete scalable services group: Error %d.


689538 Listener %s did not stop.(%s)

Description:

Failed to stop the Oracle listener using the 'lsnrctl' command. HA-Oracle will attempt to kill the listener process.

Solution:

None


689887 Failed to stop the process with: %s. Retry with SIGKILL.

Description:

The process monitor facility failed to stop the data service. It is reattempting to stop the data service.

Solution:

This is an informational message. Check the Stop_timeout value and adjust it if it is not appropriate.


689989 Invalid device group name <%s> supplied

Description:

The diskgroup name defined in the SUNW.HAStorage type resource is invalid.

Solution:

Check and set the correct diskgroup name in extension property "ServicePaths" of SUNW.HAStorage type resource.


690417 Protocol is missing in system defined property %s.

Description:

The specified system property does not have a valid format. The value of the property must include a protocol.

Solution:

Use scrgadm(1M) to specify the property value with protocol. For example: TCP.
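
A value can be screened for entries without a protocol before it is passed to scrgadm. The following minimal Python sketch assumes that the property holds comma-separated entries of the form port/protocol (for example 80/tcp); that format and the function name entries_missing_protocol are assumptions made for illustration.

```python
def entries_missing_protocol(value):
    """Return the entries that lack a '/protocol' part.

    Assumption: value is a comma-separated list such as "80/tcp,53/udp".
    """
    missing = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        _port, separator, protocol = entry.partition("/")
        if not separator or not protocol.strip():
            missing.append(entry)
    return missing


if __name__ == "__main__":
    print(entries_missing_protocol("80/tcp,8080"))
```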


690463 Cannot bring server online on this node.

Description:

Oracle server is running but it cannot be brought online on this node. START method for the resource has failed.

Solution:

Check if Oracle server can be started manually. Examine the log files and setup. Clear START_FAILED flag on the resource and bring the resource online.


691493 One or more of the SUNW.HAStoragePlus resources that this resource depends on is in a different resource group.

Description:

It is an invalid configuration to have an application resource depend on one or more SUNW.HAStoragePlus resource(s) that are in a different resource group.

Solution:

Change the resource/resource group configuration such that the application resource and the SUNW.HAStoragePlus resource(s) are in the same resource group.


691736 CMM: Quorum device %ld (%s) with votecount = %d removed.

Description:

The specified quorum device with the specified votecount has been removed from the cluster. A quorum device being placed in maintenance state is equivalent to it being removed from the quorum subsystem's perspective, so this message will be logged when a quorum device is put in maintenance state as well as when it is actually removed.

Solution:

This is an informational message, no user action is needed.


692203 Failed to stop development system.

Description:

Stopping the development system failed.

Solution:

Informational message. Check previous messages in the system log for more details regarding why it failed.


695728 Skipping checks dependant on HAStoragePlus resources on this node.

Description:

This resource will not perform some filesystem specific checks (during VALIDATE or MONITOR_CHECK) on this node because at least one SUNW.HAStoragePlus resource that it depends on is online on some other node.

Solution:

None.


696186 This list element in System property %s has an invalid port number: %s.

Description:

The system property that was named does not have a valid port number.

Solution:

Change the value of the property to use a valid port number.
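
What counts as a valid port number is not spelled out in this entry. The following Python sketch assumes the usual TCP/UDP range of 1 through 65535 and accepts plain decimal digits only; both the range check and the function name is_valid_port are assumptions for illustration, not the validation that Sun Cluster performs.

```python
import re


def is_valid_port(text):
    """True if text is a plain decimal number in the range 1..65535."""
    text = text.strip()
    if re.fullmatch(r"[0-9]+", text) is None:
        return False
    return 1 <= int(text) <= 65535


if __name__ == "__main__":
    for candidate in ("80", "0", "65536", "http"):
        print(candidate, is_valid_port(candidate))
```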


696463 rgm_clear_util called on resource <%s> with incorrect flag <%d>

Description:

An internal rgmd error has occurred while attempting to carry out an operator request to clear an error flag on a resource. The attempted clear action will fail.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


697026 did instance %d created.

Description:

Informational message from scdidadm.

Solution:

No user action required.


697108 t_sndudata in send_reply failed.

Description:

Call to t_sndudata() failed. The "t_sndudata" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


697588 Nodeid must be less than %d. Nodeid passed: '%s'

Description:

Incorrect nodeid passed to Oracle unix dlm. Oracle unix dlm will not start.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


697663 INTERNAL ERROR:BV extension property structure is NULL.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


697725 Error retrieving result of bind to %s port %d for non-secure resource %s: %s

Description:

An error occurred while fault monitor attempted to probe the health of the data service.

Solution:

Wait for the fault monitor to correct this by doing restart or failover. For more error description, look at the syslog messages.


698239 Monitor server stopped.

Description:

The Monitor server has been stopped by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


698512 Directory %s is not readable: %s.

Description:

The specified path does not exist or is not readable.

Solution:

Consult the HA-NFS configuration guide on how to configure the dfstab.<resource_name> file for HA-NFS resources.


698526 scvxvmlg error - service %s has service_class %s, not %s, ignoring it

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


698744 scvxvmlg error - lstat(%s) failed with errno %d

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


699689 (%s) poll failed: %s (UNIX errno %d)

Description:

Call to poll() failed. The "poll" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

Message IDs 700000–799999


700161 Fault monitor is already running.

Description:

The resource's fault monitor is already running.

Solution:

This is an internal error. Save the /var/adm/messages file from all the nodes. Contact your authorized Sun service provider.


700321 exec() of %s failed: %m.

Description:

The exec() system call failed for the given reason.

Solution:

Verify that the pathname given is valid.


701136 Failed to stop monitor server.

Description:

Sun Cluster HA for Sybase failed to stop monitor server using KILL signal.

Solution:

Please examine whether any Sybase server processes are running on the server. Please shut down the server manually.


701567 Unable to bind door %s: %s

Description:

Solution:


702148 internal error


703156 scha_control GIVEOVER failed with error code: %s

Description:

Fault monitor had detected problems in Oracle listener. Attempt to switchover resource to another node failed. Error returned by API call scha_control is indicated in the message.

Solution:

Check Oracle listener setup. Please make sure that Listener_name specified in the resource property is configured in listener.ora file. Check 'Host' property of listener in listener.ora file. Examine log file and syslog messages for additional information.


703450 Despite the warnings, the validation of the hostname list succeeded


703476 clcomm: unable to create desired unref threads

Description:

The system was unable to create threads that deal with no longer needed objects. The system fails to create threads when memory is not available. This message can be generated by the inability of either the kernel or a user level process. The kernel creates unref threads when the cluster starts. A user level process creates threads when it initializes.

Solution:

Take steps to increase memory availability. The installation of more memory will avoid the problem with a kernel inability to create threads. For a user level process problem: install more memory, increase swap space, or reduce the peak work load.


703553 Resource group name or resource name is too long.

Description:

The process monitor facility failed to execute the command. The resource group name or resource name is too long for the process monitor facility command.

Solution:

Check the resource group name and resource name. Give a shorter name to the resource group or resource.


703744 reservation fatal error(%s) - get_cluster_state() exception

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


704795 in libsecurity could not negotiate uid on any transport in NETPATH

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish an rpc connection for the network specified, because it couldn't find any transport. This happened because either there are no available transports at all, or there are some but none is a loopback. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


705163 load balancer thread failed to start for %s

Description:

The system has run out of the resources that are required to create a thread. The system could not create the load balancer thread.

Solution:

The service group is created with the default load balancing policy. If rebalancing is required, free up resources by shutting down some processes. Then delete the service group and re-create it.


705629 clutil: Can't allocate hash table

Description:

The system attempted unsuccessfully to allocate a hash table. There was insufficient memory.

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.


705693 listen: %s

Description:

Solution:


706159 Failed to switchover resource group %s: %s

Description:

An attempt to switch over the specified resource group failed. The reason for the failure is logged.

Solution:

Look for the message indicating the reason for this failure. This should help in the diagnosis of the problem.


706314 clexecd: Error %d from open(/dev/zero). Exiting.

Description:

clexecd program has encountered a failed open(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


707421 %s: Cannot create a thread.

Description:

Solaris has reached its limit on threads. Either too many clients are requesting a service, causing many threads to be created at once, or the system is overloaded with processes.

Solution:

Reduce system load by reducing the number of requestors of this service or by halting other processes on the system.


707881 clcomm: thread_create failed for autom_thread

Description:

The system could not create the needed thread, because there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage. Since this happens during system startup, application memory usage is normally not a factor.


707948 launching method <%s> for resource <%s>, resource group <%s>, timeout <%d> seconds

Description:

RGM has invoked a callback method for the named resource, as a result of a cluster reconfiguration, scha_control GIVEOVER, or scswitch.

Solution:

This is an informational message, no user action is needed.


708422 Command {%s} failed: %s.

Description:

The command noted did not return the expected value. Additional information may be found in the error message after the ':', or in subsequent messages in syslog.

Solution:

This message is issued from a general purpose routine. Appropriate action may be indicated by the additional information in the message or in syslog.


708825 Failed to validate IPMP group name <%s> pnm errorcode <%d>.


709082 "pmfadm -k": Can not signal <%s>: Monitoring is not resumed on pid %d

Description:

The command 'pmfadm -k' can not be executed on the given tag because the monitoring is suspended on the indicated pid.

Solution:

Resume the monitoring on the indicated pid with the 'pmfctl -R' command.


710143 Failed to add node %d to scalable service group %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


711956 open /dev/ip failed: %s.

Description:

System was attempting to open the specified device, but was unable to do so.

Solution:

This might be the result of a lack of system resources. Check whether the system is low on memory and take appropriate action. For specific error information, check the syslog message.


712367 clcomm: Endpoint %p: deferred task not allowed in state %d

Description:

The system maintains information about the state of an Endpoint. A deferred task is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


712591 Validation failed. Resource group property FAILBACK must be FALSE

Description:

The resource being created or modified must belong to a group that has a value of FALSE for its FAILBACK property.

Solution:

Specify FALSE for the FAILBACK property.


713428 Confdir_list must be an absolute path.

Description:

Each entry in Confdir_list must be an absolute path (that is, it must start with '/').

Solution:

Create the resource with absolute paths in Confdir_list.
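The check described above can be sketched as follows. This is a minimal illustration only, not the data service's actual validation code; the function name and the sample paths are hypothetical.

```python
def non_absolute_entries(confdir_list):
    """Return the entries that are not absolute paths (do not start with '/')."""
    return [entry for entry in confdir_list if not entry.startswith("/")]


# Hypothetical property values, used only to illustrate the check.
entries = ["/global/web/conf", "conf/instance2"]
for bad in non_absolute_entries(entries):
    print("Not an absolute path:", bad)
```

An empty result means that every entry would satisfy the absolute-path requirement.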


714123 Stopping the backup server.

Description:

The backup server is about to be brought down by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


714173 Load balancer setting distribution.


714208 Starting liveCache timed out with command %s.

Description:

Starting liveCache timed out.

Solution:

Look for syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


714838 reservation fatal error(%s) - Unable to open name file '%s', errno %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.
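The transition-specific guidance in the Solution above can be summarized as a lookup table. The sketch below only restates that text; the dictionary and function names are hypothetical and are not part of Sun Cluster.

```python
# Transition name (as it appears in the message) -> summary of the guidance above.
SUGGESTED_ACTIONS = {
    "node_join": (
        "This node may be unable to access shared devices. Try "
        "'/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes."
    ),
    "release_shared_scsi2": (
        "A joining node may be unable to access shared devices. Try "
        "'/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes."
    ),
    "make_primary": (
        "A device group failed to start on this node. Use the scswitch "
        "command to switch or start the device group."
    ),
    "primary_to_secondary": (
        "Shutdown or switchover of a device group failed. Retry the desired action."
    ),
}


def suggested_action(transition):
    """Return the summarized guidance for a transition, or a generic fallback."""
    return SUGGESTED_ACTIONS.get(
        transition,
        "Save /var/adm/messages from all nodes and contact your service provider.",
    )


print(suggested_action("make_primary"))
```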


715958 Method <%s> on resource <%s> stopped due to receipt of signal <%d>

Description:

A resource method was stopped by a signal, most likely resulting from an operator-issued kill(1). The method is considered to have failed.

Solution:

The operator must kill the stopped method. The operator may then choose to issue an scswitch(1M) command to bring resource groups onto desired primaries, or re-try the administrative action that was interrupted by the method failure.


716023 BV1TO1 variable not set.

Description:

The BV1TO1 variable is not configured in bv1to1.conf file.

Solution:

Reconfigure the Broadvision site with the proper BV1TO1 value.


716253 launch_fed_prog: fe_set_env_vars() failed for program <%s>, step <%s>

Description:

The ucmmd server was not able to get the locale environment. An error message is output to syslog.

Solution:

Investigate whether the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


718325 Failed to stop development system within %d seconds. Will continue to stop the development system in the background. Meanwhile, the production system Central Instance is started up now.

Description:

Failed to shut down the development system within the timeout period. The shutdown will continue in the background. Meanwhile, the Central Instance will be started up.

Solution:

No action needed. You might consider increasing the Dev_stop_pct property or Start_timeout property.


718457 Dispatcher Process is not running. pid was %d

Description:

The main dispatcher process is not present in the process list, indicating that the main dispatcher is not running on this node.

Solution:

No action needed. Fault monitor will detect that the main dispatcher process is not running, and take appropriate action.


719114 Failed to parse key/value pair from command line for %s.

Description:

The validate method for the scalable resource network configuration code was unable to convert the property information given to a usable format.

Solution:

Verify the property information was properly set when configuring the resource.


719497 clcomm: path_manager using RT lwp rather than clock interrupt

Description:

The system has been built to use a real time thread to support path_manager heart beats instead of the clock interrupt.

Solution:

No user action is required.


719682 fopen: %s

Description:

Solution:


719997 Failed to pre-allocate swap space

Description:

The pmfd, fed, or other program was not able to allocate swap space. This means that the machine is low in swap space. The server does not come up, and an error message is output to syslog.

Solution:

Investigate if the machine is running out of swap. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


721252 cm2udlm: cm_getclustmbyname: %s

Description:

Could not create a structure for communication with the cluster monitor process.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


721263 Extension property <stop_signal> has a value of <%d>

Description:

Resource property stop_signal is set to a value or has a default value.

Solution:

This is an informational message, no user action is needed.


721650 Siebel server not running.

Description:

Siebel server may not be running.

Solution:

This is an informational message. The fault monitor should either restart or fail over the Siebel server resource. This message may also be generated during the start method while waiting for the service to come up.


721881 dl_attach: kstr_msg failed %d error

Description:

Could not attach to the private interconnect.

Solution:

Reboot of the node might fix the problem.


722270 fatal: cannot create state machine thread

Description:

The rgmd was unable to create a thread upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


722439 Restarting using scha_control RESOURCE_RESTART

Description:

Fault monitor has detected problems in RDBMS server. Attempt will be made to restart RDBMS server on the same node.

Solution:

Check the cause of RDBMS failure.


722768 %s: could not get network addresses.

Description:

The daemon is unable to get net addresses of itself and caller.


722904 Failed to open the resource group handle: %s.

Description:

An API operation has failed while retrieving the resource group property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components. For the resource group name and the property name, check the current syslog message.


722984 call to rpc.fed failed for resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd failed in an attempt to execute a method, due to a failure to communicate with the rpc.fed daemon. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state. If the rpc.fed process died, this might lead to a subsequent reboot of the node.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


723206 SAP is already running.

Description:

SAP is already running either locally on this node or remotely on a different node in the cluster outside of the control of the Sun Cluster.

Solution:

Shut down SAP first, before starting SAP under the control of Sun Cluster.


724035 Failed to connect to %s secure port %d.

Description:

An error occurred while the fault monitor was trying to connect to a secure port specified in the Port_list property for this resource.

Solution:

Check to make sure that the Port_list property is correctly set to the same port number that the Netscape Directory Server is running on.


726004 Invalid timeout value %d passed.

Description:

Failed to execute the command under the specified timeout. The specified timeout is invalid.

Solution:

Respecify a positive, non-zero timeout value.


726417 read %d for %sport

Description:

Could not get the port information from config file udlm.conf.

Solution:

Check to make sure the udlm.conf file exists and has an entry for udlm.port. If everything looks normal and the problem persists, contact your Sun service representative.


727065 CMM: Enabling failfast on quorum device %s failed with error %d.

Description:

An attempt to enable failfast on the specified quorum device failed with the specified error.

Solution:

Check if the specified quorum disk has failed. This message may also be logged when a node is booting up and has been preempted from the cluster, in which case no user action is necessary.


727160 msg of wrong version %d, expected %d

Description:

udlmctl received an illegal message.

Solution:

None. udlm will handle this error.


728216 reservation error(%s) - did_get_path() error

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


728425 INTERNAL ERROR: bad state <%s> (%d) for resource group <%s> in rebalance()

Description:

An internal error has occurred in the rgmd. This may prevent the rgmd from bringing the affected resource group online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


728881 Failed to read data: %s.

Description:

Failed to read the data from the socket. The reason might be expiration of timeout, hung application or heavy load.

Solution:

Check if the application is hung. If this is the case, restart the application.


728928 CCR: Can't access table %s on node %s errno = %d.

Description:

The indicated error occurred when the CCR tried to access the indicated table on the nodes in the cluster. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28 (ENOSPC) indicates that the root file system on the node is full. Other values of errno can be returned when the root disk has failed (EIO).

Solution:

There may be other related messages on the node where the failure occurred. They may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced. If the indicated table was accidentally removed, boot the indicated node in -x mode to restore the indicated table from backup. The CCR tables are located at /etc/cluster/ccr/.
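When reading the errno value from this message, it can help to translate the number into its symbolic name. The following sketch uses Python's standard errno module. The helper name is hypothetical, and the numeric values come from the platform where the script runs, so run it on the affected Solaris node or compare against /usr/include/sys/errno.h.

```python
import errno
import os


def describe_errno(value):
    """Return 'number (NAME): text' for an errno value on the local platform."""
    name = errno.errorcode.get(value, "UNKNOWN")
    return f"{value} ({name}): {os.strerror(value)}"


# The two cases called out in the description above.
print(describe_errno(errno.ENOSPC))  # root file system full
print(describe_errno(errno.EIO))     # possible root disk failure
```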


729152 clexecd: Error %d from F_SETFD. Exiting.

Description:

clexecd program has encountered a failed fcntl(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


730190 scvxvmlg error - found non device-node or non link %s, directory not removed

Description:

The program responsible for maintaining the VxVM namespace has detected suspicious entries in the global device namespace.

Solution:

The global device namespace should only contain diskgroup directories and volume device nodes for registered diskgroups. The specified path was not recognized as either of these and should be removed from the global device namespace.


730685 PCSTATUS: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


730782 Failed to update scalable service: Error %d.

Description:

Update to a property related to scalability was not successfully applied to the system.

Solution:

Use scswitch to try to bring resource offline and online again on this node. If the error persists, reboot the node and contact your Sun service representative.


730956 %d entries found in property %s. For a nonsecure Netscape Directory Server instance %s should have exactly one entry.

Description:

Since a nonsecure Netscape Directory Server instance only listens on a single port, the list property should only have a single entry. A different number of entries was found.

Solution:

Change the number of entries to be exactly one.


731263 %s: run callback had a NULL event

Description:

The run_callback() routine is called only when an IPMP group's state changes from OK to DOWN and also when an IPMP group is updated (adapter added to the group).

Solution:

Save a copy of the /var/adm/messages files on the node. Contact your authorized Sun service provider for assistance in diagnosing the problem.


731616 No memory.


732069 dl_attach: DL_ERROR_ACK protocol error

Description:

Could not attach to the physical device.

Solution:

Check the documentation for the driver associated with the private interconnect. It might be that the message returned is too small to be valid.


732569 reservation error(%s) error. Not found clexecd on node %d.

Description:

The device fencing code was unable to communicate with another cluster node.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


732643 scha_control: warning: cannot store %s restart timestamp for resource group <%s> resource <%s>: time() failed, errno <%d> (%s)

Description:

A time() system call has failed. This prevents updating the history of scha_control restart calls. This could cause the scha_resource_get (NUM_RESOURCE_RESTARTS) or (NUM_RG_RESTARTS) query to return an inaccurate value on this node. This in turn could cause a failing resource to be restarted continually rather than failing over to another node. However, this problem is very unlikely to occur.

Solution:

If this message is produced and it appears that a resource or resource group is continually restarting without failing over, try switching the resource group to another node. Other syslog error messages occurring on the same node might provide further clues to the root cause of the problem.


732787 bv1to1.conf.sh file is not found in the %s/etc directory

Description:

The bv1to1.conf.sh file is not accessible.

Solution:

Check if the file exists in $BV1TO1_VAR/etc/bv1to1.conf.sh. If the file exists in this directory check if the BV1TO1_VAR extension property is correctly set.


732822 clconf: Invalid group name

Description:

An invalid group name has been encountered while converting a group name to clconf_obj type. Valid group names are "cluster", "nodes", "adapters", "ports", "blackboxes", "cables", and "quorum_devices".

Solution:

This is an unrecoverable error, and the cluster needs to be rebooted. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


733367 lkcm_act: %s: %s cm_reconfigure failed

Description:

ucmm reconfiguration failed.

Solution:

None if the next reconfiguration succeeds. If not, save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


734057 clcomm: Duplicate TypeId's: %s : %s

Description:

The system records type identifiers for multiple kinds of type data. The system checks for type identifiers when loading type information. This message identifies two items having the same type identifiers. This checking only occurs on debug systems.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


734173 No hosts are configured in bv1to1.conf.

Description:

There are no hosts configured to run Broadvision processes.

Solution:

Configure the BV hosts in bv1to1.conf file properly.


734832 clutil: Created insufficient threads in threadpool

Description:

There was insufficient memory to create the desired number of threads.

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.


734890 pthread_detach: %s

Description:

The rpc.pmfd server was not able to detach a thread, possibly due to low memory. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If all looks correct, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


735336 Media error encountered, but Auto_end_bkp is disabled.

Description:

The HA-Oracle start method identified that one or more datafiles is in need of recovery. The Auto_end_bkp extension property is disabled so no further recovery action was taken.

Solution:

Examine the log files for the cause of the media error. If it's caused by datafiles being left in hot backup mode, the Auto_end_bkp extension property should be enabled or the datafiles should be recovered manually.


737104 Received unexpected result <%d> from rpc.fed, aborting node

Description:

This node encountered an unexpected error while communicating with other cluster nodes during a cluster reconfiguration. The ucmmd will produce a core file and will cause the node to halt or reboot.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


737125 INTERNAL ERROR: PENDING_OFF_STOP_FAILED or ERROR_STOP_FAILED in rebalance()

Description:

An internal error has occurred in the rgmd. This may prevent the rgmd from bringing affected resource groups online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


737445 rebalance: not attempting to start resource group <%s> on node <%s> because this resource group has already failed to start on this node %d or more times in the past %d seconds

Description:

The rgmd is preventing "ping-pong" failover of the resource group, i.e., repeated failover of the resource group between two or more nodes.

Solution:

The time interval in seconds that is mentioned in the message can be adjusted by using scrgadm(1M) to set the Pingpong_interval property of the resource group.
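The condition reported in the message (a resource group that has already failed to start on a node a given number of times within a given number of seconds) can be modeled as a sliding-window check. The sketch below is an illustrative model only and is not the rgmd implementation; the function and parameter names are hypothetical.

```python
def should_skip_start(failure_times, now, max_failures, interval_seconds):
    """Return True if at least max_failures start failures on this node
    fall within the last interval_seconds (all times in seconds)."""
    recent = [t for t in failure_times if 0 <= now - t <= interval_seconds]
    return len(recent) >= max_failures


# Two failures in the past hour, threshold of one failure per 3600 seconds.
print(should_skip_start([100, 900], now=1000, max_failures=1, interval_seconds=3600))
```

In this model, raising the interval makes an old failure count for longer, and lowering it lets the node be retried sooner.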


737572 PMF error when starting Sybase %s: %s. Error: %s

Description:

Sun Cluster HA for Sybase failed to start the Sybase server using the Process Monitoring Facility (PMF). Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Check whether the server can be started manually. Examine the HA-Sybase log files, Sybase log files, and setup.


738465 Malformed adapter specification %s.

Description:

Failed to retrieve the ipmp information. The given adapter specification is invalid.

Solution:

Check whether the adapter specification is in the form of ipmpgroup@nodename. If not, recreate the resource with the properly formatted adapter information.
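A quick way to check the ipmpgroup@nodename form is sketched below. It assumes only what the Solution states: exactly one '@' separating a non-empty group name from a non-empty node name. The function name and sample values are hypothetical, and the real validation may accept or reject other forms.

```python
def parse_adapter_spec(spec):
    """Split 'ipmpgroup@nodename' into (group, node); return None if malformed."""
    group, sep, node = spec.partition("@")
    if not sep or not group or not node or "@" in node:
        return None
    return group, node


# Hypothetical adapter specifications.
for spec in ("ipmp0@node1", "ipmp0", "@node1"):
    print(spec, "->", parse_adapter_spec(spec))
```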


738847 clexecd: unable to create failfast object.

Description:

The clexecd program could not enable one of the mechanisms that cause the node to be shut down to prevent data corruption when the clexecd program dies.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


739356 warning: cannot store start_failed timestamp for resource group <%s>: time() failed, errno <%d> (%s)

Description:

The specified resource group failed to come online on some node, but this node is unable to record that fact due to the failure of the time(2) system call. The consequence of this is that the resource group may continue to pingpong between nodes for longer than the Pingpong_interval property setting.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. If the same error recurs, you might have to reboot the affected node.


739653 Port number %d is listed twice in property %s, at entries %d and %d.

Description:

The port number in the message was listed twice in the named property, at the list entry locations given in the message. A port number should only appear once in the property.

Solution:

Specify the property with only one occurrence of the port number.
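Before setting the property, a port list can be checked for repeated entries. The following is a minimal sketch, assuming that ports are given as plain integers and that entry positions are counted from 1; neither assumption is confirmed by the message text, and the function name is hypothetical.

```python
def find_duplicate_ports(ports):
    """Return (port, first_position, repeat_position) for each repeated port.
    Positions are 1-based (an assumption made for this illustration)."""
    first_seen = {}
    duplicates = []
    for position, port in enumerate(ports, start=1):
        if port in first_seen:
            duplicates.append((port, first_seen[port], position))
        else:
            first_seen[port] = position
    return duplicates


# Hypothetical port list with one repeated entry.
print(find_duplicate_ports([389, 636, 389]))
```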


740373 Failed to get the scalable service related properties for resource %s.

Description:

An unexpected error occurred while trying to collect the properties related to scalable networking for the named resource.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


740972 in fe_set_env_vars setlocale failed

Description:

The rgmd server was not able to get the locale environment, while trying to connect to the rpc.fed server. An error message is output to syslog.

Solution:

Investigate whether the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


742337 Node %d is in the %s for resource %s, but property %s identifies resource %s which cannot host an address on node %d.

Description:

All IP addresses used by this resource must be configured to be available on all nodes that the scalable resource can run on.

Solution:

Either change the resource group nodelist to exclude the nodes that cannot host the SharedAddress IP address, or select a different network resource whose IP address will be available on all nodes where this scalable resource can run.


743362 could not read failfast mode, using panic

Description:

/opt/SUNWudlm/etc/udlm.conf did not have an entry for failfast mode. Default mode of 'panic' will be used.

Solution:

None.


743923 Starting server with command %s.

Description:

Sun Cluster is starting the application with the specified command.

Solution:

This is an informational message, no user action is needed.


744837 No executable $BV1TO1/bin/bvconf

Description:

The specified executable is not found.

Solution:

Check if the Broadvision software was installed properly. Make sure the specified executable is available at the right location.


744866 Failed to check status of SUNW.HAStoragePlus resource.

Description:

An error occurred while checking the status of the SUNW.HAStoragePlus resource that this resource depends on.

Solution:

Check syslog messages and correct the problems specified in prior syslog messages. If the error still persists, please report this problem.


745100 CMM: Quorum device %ld has been changed from %s to %s.

Description:

The name of the specified quorum device has been changed as indicated. This can happen if while this node was down, the previous quorum device was removed from the cluster, and a new one was added (and assigned the same id as the old one) to the cluster.

Solution:

This is an informational message, no user action is needed.


745275 PNM daemon system error: %s

Description:

A system error has occurred in the PNM daemon. This could be because system resources are very low, for example, low memory.

Solution:

If the message is:

out of memory - increase the swap space, install more memory or reduce peak memory consumption. Otherwise the error is unrecoverable, and the node needs to be rebooted.
can't open file - check the "open" man page for possible errors.
fcntl error - check the "fcntl" man page for possible errors.
poll failed - check the "poll" man page for possible errors.
socket failed - check the "socket" man page for possible errors.
SIOCGLIFNUM failed - check the "ioctl" man page for possible errors.
SIOCGLIFCONF failed - check the "ioctl" man page for possible errors.
wrong address family - check the "ioctl" man page for possible errors.
SIOCGLIFFLAGS failed - check the "ioctl" man page for possible errors.
SIOCGLIFADDR failed - check the "ioctl" man page for possible errors.
rename failed - check the "rename" man page for possible errors.
SIOCGLIFGROUPNAME failed - check the "ioctl" man page for possible errors.
setsockopt (SO_REUSEADDR) failed - check the "setsockopt" man page for possible errors.
bind failed - check the "bind" man page for possible errors.
listen failed - check the "listen" man page for possible errors.
read error - check the "read" man page for possible errors.
SIOCSLIFGROUPNAME failed - check the "ioctl" man page for possible errors.
SIOCSLIFFLAGS failed - check the "ioctl" man page for possible errors.
SIOCGLIFNETMASK failed - check the "ioctl" man page for possible errors.
SIOCSLIFADDR failed - check the "ioctl" man page for possible errors.
SIOCLIFREMOVEIF failed - check the "ioctl" man page for possible errors.
SIOCSLIFNETMASK failed - check the "ioctl" man page for possible errors.
write error - check the "write" man page for possible errors.
accept failed - check the "accept" man page for possible errors.
wrong peerlen %d - check the "accept" man page for possible errors.
gethostbyname failed %s - make sure entries in /etc/hosts, /etc/nsswitch.conf and /etc/netconfig are correct to get information about this host.
SIOCGIFARP failed - check the "ioctl" man page for possible errors. Check the arp cache to see if all the adapters in the node have their entries.
can't install SIGTERM handler - check the man page for possible errors.


747567 Unable to complete any share commands.

Description:

None of the paths specified in the dfstab.<resource-name> file were shared successfully.

Solution:

The prenet_start method would fail. Sun Cluster resource management would attempt to bring the resource on-line on some other node. Manually check that the paths specified in the dfstab.<resource-name> file are correct.


748729 clconf: Failed to open table infrastructure in unregister_infr_callback

Description:

Failed to open table infrastructure in unregistered clconf callback with CCR. Table infrastructure not found.

Solution:

Check the table infrastructure.


749409 clcomm: validate_policy: high not enough. high %d low %d in c %d nodes %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. For a variable size resource pool, the high server thread level must be large enough to allow all of the nodes identified in the message to join the cluster and receive a minimal number of server threads.

Solution:

No user action required.


749958 CMM: Unable to create %s thread.

Description:

The CMM was unable to create its specified thread and the system cannot continue. This is caused by inadequate memory on the system.

Solution:

Add more memory to the system. If that does not resolve the problem, contact your authorized Sun service provider to determine whether a workaround or patch is available.


751079 scha_cluster_open failed

Description:

Call to initialize a handle to get cluster information failed. This means that the incoming connection to the PNM daemon will not be accepted.

Solution:

There could be other related error messages which might be helpful. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


751258 TCPTR: Node %u attempting to join cluster has incompatible cluster software. \"%s\" on node %u not compatible with \"%s\" on node %u.

Description:

Transport at the local node received an initial handshake message from a remote node that is not running a compatible version of the Sun Cluster software.

Solution:

Make sure all nodes in the cluster are running compatible versions of Sun Cluster software.


751934 scswitch: rgm_change_mastery() failed with NOREF, UNKNOWN, or invalid error on node %d

Description:

An inter-node communication failed with an unknown exception while the rgmd was attempting to execute an operator-requested switch of the primaries of a resource group, or was attempting to "fail back" a resource group onto a node that just rejoined the cluster. This will cause the attempted switching action to fail.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. If the switch was operator-requested, retry it. If the same error recurs, you might have to reboot the affected node. Since this problem might indicate an internal logic error in the clustering software, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


752212 Failed to retrieve the resource handle: %s


753155 Starting fault monitor. pmf tag %s.

Description:

The fault monitor is being started under control of the Process Monitoring Facility (PMF), with the tag indicated in the message.

Solution:

This is an informational message; no user action is needed.


754283 pipe: %s

Description:

The rpc.fed server was not able to create a pipe. The message contains the system error. The server will not capture the output from methods it runs.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


754521 Property %s does not have a value. This property must have exactly one value.

Description:

The property named does not have a value specified for it.

Solution:

Set the property to have exactly one value.


754848 The property %s must contain at least one SharedAddress network resource.

Description:

The named property must contain at least one SharedAddress.

Solution:

Specify a SharedAddress resource for this property.


756082 clcomm:Cannot fork() after ORB server initialization.

Description:

A user level process attempted to fork after ORB server initialization. This is not allowed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


756650 Failed to set the global interface node to %d for IP %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


757236 Error initializing LDAP library to probe %s port %d for non-secure resource %s: %s

Description:

An error occurred while initializing the LDAP library. The error message will contain the error returned by the library.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


757581 Failed to stop daemon %s.

Description:

The HA-NFS implementation was unable to stop the specified daemon.

Solution:

The resource could be in a STOP_FAILED state. If the Failover_mode property is set to HARD, the node would get automatically rebooted by Sun Cluster resource management. If the Failover_mode property is set to SOFT or NONE, check that the specified daemon is indeed stopped (killing it by hand, if necessary). Then clear the STOP_FAILED status on the resource and bring it on-line again using the scswitch command.


757758 scvxvmlg error - getminor called with a bad filename: %s

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


760001 (%s) netconf error: cannot get transport info for 'ticlts' %s

Description:

Call to getnetconfigent failed and udlmctl could not get network information. udlmctl will exit.

Solution:

Make sure the interconnect does not have any problems. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.
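The files named above can be gathered with a small shell helper. This is a minimal sketch, not a Sun-supplied tool: the function name list_udlm_diagnostics and its optional root-prefix argument are hypothetical, and only the three paths given in this guide are used.

```shell
# Hypothetical helper: prints each diagnostic file named in this guide that
# actually exists. An optional first argument is a root prefix, useful when
# working on a copy of a node's file system; it defaults to "/".
list_udlm_diagnostics() {
    root="${1:-}"
    for f in "$root"/var/adm/messages \
             "$root"/var/cluster/ucmm/ucmm_reconf.log \
             "$root"/var/cluster/ucmm/dlm*/*logs/*; do
        # An unmatched glob stays literal and fails the -f test, so it is skipped.
        [ -f "$f" ] && echo "$f"
    done
    return 0
}
```

Run on each node, the output could be passed to an archiver, for example: tar cf /var/tmp/udlm-diag.tar `list_udlm_diagnostics` (the archive name is only an example).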


760354 modinstall of cldlpihb failed

Description:

The streams module that intercepts heartbeat messages could not be installed.


761076 dl_bind: DL_ERROR_ACK protocol error

Description:

Could not bind to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


762902 Failed to restart fault monitor.

Description:

The resource property that was updated needed the fault monitor to be restarted in order for the change to take effect, but the attempt to restart the fault monitor failed.

Solution:

Look at the prior syslog messages for specific problems. Correct the errors if possible. Look for the process <dataservice>_probe operating on the desired resource (indicated by the argument to the "-R" option). This process can be found with the command:

ps -ef | egrep <dataservice>_probe | grep "\-R <resourcename>"

Send a kill signal to this process. If the process does not get killed and restarted by the Process Monitoring Facility, reboot the node.
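The process lookup described above can also be sketched as a shell function that extracts only the process IDs. This is an illustrative helper, not part of Sun Cluster: the name find_probe_pids is hypothetical, it assumes "ps -ef" output with the PID in the second column, and on Solaris it may need nawk or /usr/xpg4/bin/awk in place of awk.

```shell
# Hypothetical helper: reads "ps -ef" style lines on standard input and prints
# the PID (second column) of each process whose line contains the probe name
# ($1) and whose "-R" option is followed by exactly the resource name ($2).
find_probe_pids() {
    awk -v probe="$1" -v res="$2" '
        index($0, probe) {
            for (i = 1; i < NF; i++)
                if ($i == "-R" && $(i + 1) == res) { print $2; break }
        }'
}
```

For example, "ps -ef | find_probe_pids nfs_probe nfs-rs" would print the PID of the probe for a resource named nfs-rs; both names are placeholders. Unlike a plain grep, the exact comparison does not match a resource whose name merely begins with nfs-rs.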


763305 realloc: %d

Description:

Solution:


763570 can't start pnmd due to lock

Description:

An attempt was made to start multiple instances of the PNM daemon pnmd(1M), or pnmd(1M) has a problem acquiring a lock on the file (/var/cluster/run/pnm_lock).

Solution:

Check if another instance of pnmd is already running. If not, remove the lock file (/var/cluster/run/pnm_lock) and start pnmd by sending a KILL (9) signal to pnmd. PMF will restart pnmd automatically.


763781 For global service <%s> of path <%s>, local node is less preferred than node <%d>. But affinity switch over may still be done.

Description:

A service is switched to a less preferred node due to affinity switchover of SUNW.HAStorage prenet_start method.

Solution:

Check which configuration gives the greater performance benefit: leaving the service on its most preferred node or letting the affinity switchover take effect. Use scswitch(1M) to switch it back if necessary.


763929 HA: rm_service_thread_create failed

Description:

The system could not create the needed thread, because there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage.


765087 uname: %s

Description:

The rpc.fed server encountered an error with the uname function. The message contains the system error.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


765395 clcomm: RT class not configured in this system

Description:

Sun Cluster requires that the real time thread scheduling class be configured in the kernel.

Solution:

Configure Solaris with the RT thread scheduling class in the kernel.


766093 IP address (hostname) and Port pairs %s%c%d%c%s and %s%c%d%c%s in property %s, at entries %d and %d, effectively duplicate each other. The port numbers are the same and the resolved IP addresses are the same.

Description:

The two list entries at the named locations in the named property have port numbers that are identical, and also have IP address (hostname) strings that resolve to the same underlying IP address. An IP address (hostname) string and port entry should only appear once in the property.

Solution:

Specify the property with only one occurrence of the IP address (hostname) string and port entry.
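One way to spot such duplicates before setting the property is to compare the entries after each hostname has been resolved to its IP address. The sketch below assumes that resolution has already been done and that the pairs are supplied one per line as "address port", in list order; the function name and input format are hypothetical and are not part of Sun Cluster. On Solaris it may need nawk or /usr/xpg4/bin/awk in place of awk.

```shell
# Hypothetical helper: reads "address port" lines, one per list entry and
# already resolved to IP addresses, and reports every entry that repeats an
# earlier address/port pair. Entries are numbered from 1 in input order.
find_duplicate_pairs() {
    awk '{
        key = $1 " " $2
        if (key in seen)
            print "entries " seen[key] " and " NR " duplicate each other"
        else
            seen[key] = NR
    }'
}
```

Two different hostname strings that resolve to the same address appear here as identical lines, which is the situation this message reports.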


766316 Started saposcol process under PMF successfully.

Description:

The SAP OS collector process was started successfully under the control of the Process Monitoring Facility (PMF).

Solution:

Informational message. No user action needed.


767363 CMM: Disconnected from node %ld; aborting using %s rule.

Description:

Due to a connection failure between the local and the specified node, the local node must be halted to avoid a "split brain" configuration. The CMM used the specified rule to decide which node to fail. Rules are:

rebootee: If one node is rebooting and the other was a member of the cluster, the node that is rebooting must abort.
quorum: The node with greater control of quorum device votes survives and the other node aborts.
node number: The node with higher node number aborts.
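The three rules can be read as a decision procedure. The following shell sketch is illustrative only: it is not the CMM implementation, the function name and its arguments are hypothetical, and it assumes the rules are tried in the order listed, which this guide does not state.

```shell
# Illustrative only; not the CMM implementation.
# Prints which node aborts ("local" or "remote") followed by the rule used.
# Arguments (all hypothetical inputs):
#   $1 local node number     $2 remote node number
#   $3 local node state      $4 remote node state  ("rebooting" or "member")
#   $5 local quorum votes    $6 remote quorum votes
pick_aborting_node() {
    if [ "$3" = rebooting ] && [ "$4" = member ]; then
        echo "local rebootee"        # the rebooting node aborts
    elif [ "$4" = rebooting ] && [ "$3" = member ]; then
        echo "remote rebootee"
    elif [ "$5" -gt "$6" ]; then
        echo "remote quorum"         # the node with more quorum votes survives
    elif [ "$5" -lt "$6" ]; then
        echo "local quorum"
    elif [ "$1" -gt "$2" ]; then
        echo "local node number"     # the higher node number aborts
    else
        echo "remote node number"
    fi
}
```

For example, "pick_aborting_node 2 1 member member 1 1" prints "local node number": neither node is rebooting, the votes are tied, and the local node has the higher node number.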

Solution:

The cause of the failure should be resolved and the node should be rebooted if node failure is unexpected.


767488 reservation fatal error(UNKNOWN) - Command not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


767629 lkcm_reg: Unix DLM version (%d) and the OSD library version (%d) are not compatible. Unix DLM versions acceptable to this library are: %d

Description:

Unix DLM and Oracle DLM are not compatible. Compatible versions will be printed as part of this message.

Solution:

Check the installation procedure to make sure you have the correct versions of Oracle DLM and Unix DLM. Contact your Sun service representative if the versions cannot be resolved.


767858 in libsecurity unknown security type %d

Description:

This is an internal error which shouldn't occur. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


768676 Failed to access <%s>: <%s>

Description:

Solution:


770355 fatal: received signal %d

Description:

The daemon indicated in the message tag (rgmd or ucmmd) has received a SIGTERM signal, possibly caused by an operator-initiated kill(1) command. The daemon will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

The operator must use scswitch(1M) and shutdown(1M) to take down a node, rather than directly killing the daemon.


770675 monitor_check: fe_method_full_name() failed for resource <%s>, resource group <%s>

Description:

During execution of a scha_control(1HA,3HA) function, the rgmd was unable to assemble the full method pathname for the MONITOR_CHECK method. This is considered a MONITOR_CHECK method failure. This in turn will prevent the attempted failover of the resource group from its current master to a new master.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


770776 INTERNAL ERROR: process_resource: Resource <%s> is R_BOOTING in PENDING_ONLINE resource group

Description:

The rgmd is attempting to bring a resource group online on a node where BOOT methods are still being run on its resources. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


771340 fatal: Resource group <%s> update failed with error <%d>; aborting node

Description:

Rgmd failed to read updated resource group from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


772294 %s requests reconfiguration in step %s

Description:

Return status at the end of a step execution indicates that a reconfiguration is required.

Solution:

None.


772395 shutdown immediate did not succeed. (%s)

Description:

Failed to shut down the Oracle server using the 'shutdown immediate' command.

Solution:

Examine the 'Stop_timeout' property of the resource. Increase 'Stop_timeout' if the Oracle server takes a long time to shut down and you do not wish to use 'shutdown abort' to stop the Oracle server.


773078 Error in configuration file lookup (%s, ...): %s

Description:

Could not read configuration file udlm.conf.

Solution:

Make sure udlm.conf exists under /opt/SUNWudlm/etc and has the correct permissions.


773226 Server_url %s probe failed

Description:

The probing of the url set in the Server_url extension property failed. The agent probe will take action.

Solution:

None. The agent probe will take action. However, the cause of the failure should be investigated further. Examine the log file and syslog messages for additional information.


773366 thread create for hb_threadpool failed

Description:

The system was unable to create the thread used for heartbeat processing.

Solution:

Take steps to increase memory availability. The installation of more memory will avoid the problem with a kernel inability to create threads. For a user level process problem: install more memory, increase swap space, or reduce the peak work load.


774752 reservation error(%s) - do_scsi3_inresv() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

For the user action required by this message, see the user action for message 192619.


775696 Unable to unlock file: %s.


776199 (%s) reconfigure: cm error %s

Description:

ucmm reconfiguration failed.

Solution:

None if the next reconfiguration succeeds. If not, save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


776339 INTERNAL ERROR: postpone_stop_r: meth type <%d>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


778629 ERROR: MONITOR_STOP method is not registered for ONLINE resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


779073 in fe_set_env_vars malloc of env_name[%d] failed

Description:

The rgmd server was not able to allocate memory for an environment variable, while trying to connect to the rpc.fed server, possibly due to low memory. An error message is output to syslog.

Solution:

Investigate if the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


779089 Could not start up DCS client because we could not contact the name server.

Description:

There was a fatal error while this node was booting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


779544 "pmfctl -R": Error resuming pid %d for tag <%s>: %d

Description:

An error occurred while rpc.pmfd attempted to resume the monitoring of the indicated pid, possibly because the indicated pid exited while rpc.pmfd was attempting to resume its monitoring.

Solution:

Check if the indicated pid has exited. If this is not the case, save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


780283 clcomm: Exception in coalescing region - Lost data

Description:

While supporting an invocation, the system wanted to combine buffers and failed. The system identifies the exception prior to this message.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


780539 Stopping fault monitor: %s:%ld:%s

Description:

Fault monitor has detected an error. Fault monitor will be stopped. Error detected by fault monitor and action taken by fault monitor is indicated in message.

Solution:

None


780792 Failed to retrieve the resource type information.

Description:

A Sun Cluster data service has failed to retrieve the resource type's property information. Low memory or an API call failure might be the reason.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components.


781445 kill -0: %s

Description:

The rpc.fed server is not able to send a signal to a tag that timed out, and the error message is shown. An error message is output to syslog.

Solution:

Save the syslog messages file. Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified.


781731 Failed to retrieve the cluster handle: %s.

Description:

An API operation has failed while retrieving the cluster information.

Solution:

This may be solved by rebooting the node. For more details about API failure, check the messages from other components.


782111 This list element in System property %s is missing a protocol: %s.

Description:

The system property that was named does not have a valid format. The value of the property must include a protocol.

Solution:

Add a protocol to the property value.


782694 The value returned for property %s for resource %s was invalid.

Description:

An unexpected value was returned for the named property.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


783130 Failed to retrieve the node id for node %s: %s.

Description:

API operation has failed while retrieving the node id for the given node.

Solution:

Check whether the node name is valid. For more information about API call failure, check the messages from other components.


783581 scvxvmlg fatal error - clconf_lib_init failed, returned %d

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


784311 Network_resources_used property not set properly

Description:

There is probably more than one logical IP address in this resource group, and the Network_resources_used property is not properly set to associate the resources with the appropriate backend hosts.

Solution:

Set the Network_resources_used property for each resource in the RG to the logical IP address in the RG that is actually configured to run BV backend processes.


784560 resource %s status on node %s change to %s

Description:

This is a notification from the rgmd that a resource's fault monitor status has changed.

Solution:

This is an informational message, no user action is needed.


784607 Couldn't fork1.

Description:

The fork1(2) system call failed.

Solution:

Some system resource has been exceeded. Install more memory, increase swap space or reduce peak memory consumption.


785003 clexecd: priocntl to set ts returned %d. Exiting.

Description:

The clexecd program has encountered a failed priocntl(2) system call. The error message indicates the error number for the failure.

Solution:

clexecd program will exit and node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


785101 transition '%s' failed for cluster '%s': unknown code %d

Description:

The mentioned state transition failed for the cluster because of an unexpected command line option. udlmctl will exit.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


785154 Could not look up IP because IP was NULL.

Description:

The mapping for the given ip address in the local host files can't be done: the specified ip address is NULL.

Solution:

Check whether the IP address has a NULL value. If this is the case, recreate the resource with a valid host name. If this is not the reason, treat it as an internal error and contact your authorized Sun service provider.


785213 reservation error(%s) - IOCDID_ISFIBRE failed for device %s, errno %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


786114 Cannot access file: %s (%s)

Description:

Unable to access the file because of the indicated reason.

Solution:

Check that the file exists and has the correct permissions.


786412 reservation fatal error(UNKNOWN) - clconf_lib_init() error, returned %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


786765 Failed to get host names from resource %s.

Description:

The networking information for the resource could not be retrieved.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


787063 Error in getting parameters for global service <%s> of path <%s>: %s

Description:

Cannot get information about the global service.

Solution:

Save a copy of /var/adm/messages and contact your authorized Sun service provider to determine the cause of the problem.


787529 NAFO group %s has status %s. The status of the NAFO group will be checked again in %d seconds.

Description:

The IPMP group named is in a transition state. The status will be checked again.

Solution:

This is an informational message, no user action is needed.


787616 Adapter %s is not a valid IPMP group on this node.

Description:

Validation of the adapter information has failed. The specified IPMP group does not exist on this node.

Solution:

Create the appropriate IPMP group on this node, or recreate the logical host with the correct IPMP group.


789135 The Data base probe %s failed.The WLS probe will wait for the DB to be UP before starting the WLS

Description:

The database probe (set in the extension property db_probe_script) failed. The start method will not start the WLS. The probe method will wait until the DB probe succeeds before starting the WLS.

Solution:

Make sure the DB probe (set in db_probe_script) succeeds. Once the DB is started the WLS probe will start the WLS instance.


788145 gethostbyname() failed: %s.

Description:

gethostbyname() failed with unexpected error.

Solution:

Check if the name service is configured correctly. Try some commands that query name servers, such as ping and nslookup, and correct the problem. If the error still persists, reboot the node.


789223 lkcm_sync: caller is not registered

Description:

udlm is not registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


789460 monitor_check: call to rpc.fed failed for resource <%s>, resource group <%s>, method <%s>

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed, due to a failure of the rgmd to communicate with the rpc.fed daemon. If the rpc.fed process died, this might lead to a subsequent reboot of the node. Otherwise, this will prevent a resource group on the local node from failing over to an alternate primary node.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


790758 Unable to open /dev/null: %s

Description:

While starting up, one of the rgmd daemons was not able to open /dev/null. The message contains the system error. This will prevent the daemon from starting on this node.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


791495 Unregistered syscall (%d)

Description:

An internal error has occurred. This should not happen. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


791577 Waiting for the host $i to startup.

Description:

Waiting for the specified host to start up.

Solution:

Bring the resource group containing the specified host online if it is not yet running. If the resource group is already online, the probe will take appropriate action.


791959 Error: reg_evt missing correct names

Description:

Solution:


792109 Unable to set number of file descriptors.

Description:

rpc.pmfd was unable to set the number of file descriptors used in the RPC server.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


792295 Some shared paths in file %s are invalid.


792338 The property %s must contain at least one value.

Description:

The named property does not have a legal value.

Solution:

Assign the property a value.


792683 clexecd: priocntl to set rt returned %d. Exiting.

Description:

The clexecd program has encountered a failed priocntl(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


792967 Unable to parse configuration file.

Description:

While parsing the Netscape configuration file, an error occurred while reading either the file or one of the fields within the file.

Solution:

Make sure that the appropriate configuration file is located in its default location with respect to the Confdir_list property.


793575 Adaptive server terminated.

Description:

Graceful shutdown did not succeed. Adaptive server processes were killed in STOP method.

Solution:

Check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.


793651 Failed to parse xml for %s: %s

Description:

Solution:


794220 switchback: bad nodename <%s>

Description:

The rgmd encountered a bad node name in the Nodelist of a resource group it was trying to failback. This might indicate corruption of CCR data or rgmd in-memory state.

Solution:

Use scrgadm(1M) -pvv to examine resource group properties. If the values appear corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Please contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


794535 clcomm: Marshal Type mismatch. Expecting type %d got type %d

Description:

When MARSHAL_DEBUG is enabled, the system tags every data item marshalled to support an invocation. This reports that the current data item in the received message does not have the expected type. The received message format is wrong.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


795047 Stop fault monitor using pmfadm failed. tag %s error=%d

Description:

The attempt to stop the fault monitor by using the Process Monitoring Facility (PMF), with the tag indicated in the message, failed. The error returned by PMF is indicated in the message.

Solution:

Stop fault monitor processes. Please report this problem.


795062 Stop fault monitor using pmfadm failed. tag %s error=%s

Description:

The attempt to stop the fault monitor by using the Process Monitoring Facility (PMF), with the tag indicated in the message, failed. The error returned by PMF is indicated in the message.

Solution:

Stop fault monitor processes. Please report this problem.


795381 t_open: %s

Description:

Call to t_open() failed. The "t_open" man page describes possible error codes. udlm exits and the node will abort and panic.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


795754 scha_control: resource <%s> restart request is rejected because the resource type <%s> must have START and STOP methods

Description:

A resource monitor (or some other program) is attempting to restart the indicated resource by calling scha_control(1HA,3HA). This request is rejected because the resource type fails to declare both a START method and a STOP method. This represents a bug in the calling program, because the resource_restart feature can only be applied to resources that have STOP and START methods. Instead of attempting to restart the individual resource, the programmer may use scha_control(RESTART) to restart the resource group.

Solution:

The resource group may be restarted manually on the same node or switched to another node by using scswitch(1M) or the equivalent GUI command. Contact the author of the data service (or of whatever program is attempting to call scha_control) and report the error.


796536 Password file %s is not readable: %s

Description:

For the secure server to run, a password file named keypass is required. This file could not be read, which resulted in an error when trying to start the Data Service.

Solution:

Create the keypass file and place it under the Confdir_list path for this resource. Make sure that the file is readable.


796771 check_for_ccrdata failed malloc of size %d

Description:

Call to malloc failed. The "malloc" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.


797604 CMM: Connectivity of quorum device %ld (%s) has been changed from 0x%llx to 0x%llx.

Description:

The number of configured paths to the specified quorum device has been changed as indicated. The connectivity information is depicted as bitmasks.

Solution:

This is an informational message, no user action is needed.


798060 Error opening procfs status file <%s> for tag <%s>: %s

Description:

The rpc.pmfd server was not able to open a procfs status file, and the system error is shown. procfs status files are required in order to monitor user processes.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


798175 sema_wait: %s

Description:

The rpc.pmfd server was not able to act on a semaphore. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


798318 Could not verify status of %s.

Description:

A critical method was unable to determine the status of the specified service or resource.

Solution:

Please examine other messages in the /var/adm/messages file to determine the cause of this problem. Also verify if the specified service or resource is available or not. If not available, start the service or resource and retry the operation which failed.


798514 Starting fault monitor. pmf tag %s

Description:

Informational message. Fault monitor is being started under control of Process Monitoring Facility (PMF), with the tag indicated in message.

Solution:

None


798658 Failed to get the resource type name: %s.

Description:

While retrieving the resource information, an API operation failed to retrieve the resource type name.

Solution:

This is an internal error. Contact your authorized Sun service provider. For a more detailed error description, check the syslog messages.


799348 INTERNAL ERROR: MONITOR_START method is not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


799426 clcomm: can't ifkconfig private interface: %s:%d cmd %d error %d

Description:

The system failed to configure a private network device, with the device and IP address shown, for IP communications across the private interconnect, resulting in the error identified in the message.

Solution:

Ensure that the network interconnect device is supported. Otherwise, contact your authorized Sun service provider to determine whether a workaround or patch is available.


799817 Failed to stop the application using SIGTERM. Will try to stop using SIGKILL

Description:

The application could not be stopped by sending SIGTERM. The STOP method will try to stop the application by sending SIGKILL with infinite timeout.

Solution:

None.

Message IDs 800000–899999


800040 Error deleting PidFile <%s> (%s) for Apache service with apachectl file <%s>.

Description:

The data service was not able to delete the specified PidFile file.

Solution:

Delete the PidFile file manually and start the resource group.


800320 Fencing %s from shared disk devices.

Description:

A reservation has been performed to fence off nonmember nodes from disks that are shared between the cluster nodes.

Solution:

None.


801519 connect: %s

Description:

Solution:


802295 monitor_check: resource group <%s> changed while running MONITOR_CHECK methods

Description:

An internal error has occurred in the locking logic of the rgmd, such that a resource group was erroneously allowed to be edited while a failover was pending on it, causing the scha_control call to return early with an error. This in turn will prevent the attempted failover of the resource group from its current master to a new master. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


802539 No permission for owner to read %s.

Description:

The owner of the file does not have read permission on it.

Solution:

Set the permissions on the file so the owner can read it.


803339 Prog <%s> step <%s>: program file is not executable.


803391 Could not validate the settings in %s. It is recommended that the settings for host lookup consult `files` before a name server.

Description:

The validation callback method has failed to validate the hostname list. There may be a syntax error in the nsswitch.conf file.

Solution:

Check for the following syntax rules in the nsswitch.conf file. 1) Check if the lookup order for "hosts" has "files". 2) "cluster" is the only entry that can come before "files". 3) Everything in between '[' and ']' is ignored. 4) It is illegal to have any leading whitespace character at the beginning of the line; these lines are skipped. Correct the syntax in the nsswitch.conf file and try again.
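The four rules above can be expressed as a small check. The following is a minimal sketch under stated assumptions, not the validation code that Sun Cluster itself runs: the function name check_hosts_line is hypothetical, and the handling of "#" comments is an assumption that is not stated in the rules.

```python
import re


def check_hosts_line(nsswitch_text):
    """Return a list of problems with the "hosts" entry; an empty list means it passed."""
    for line in nsswitch_text.splitlines():
        # Rule 4: lines that begin with whitespace are skipped.
        # Blank lines and comment lines are skipped too (assumption).
        if not line or line[0].isspace() or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if not sep or key.strip() != "hosts":
            continue
        value = value.split("#", 1)[0]              # drop a trailing comment (assumption)
        value = re.sub(r"\[[^\]]*\]", " ", value)   # Rule 3: ignore everything in [ ... ]
        sources = value.split()
        # Rule 1: the lookup order for "hosts" must contain "files".
        if "files" not in sources:
            return ['"files" is missing from the hosts lookup order']
        # Rule 2: only "cluster" may come before "files".
        before = [s for s in sources[:sources.index("files")] if s != "cluster"]
        if before:
            return [", ".join(before) + ' listed before "files"']
        return []
    return ['no "hosts" entry found']


if __name__ == "__main__":
    print(check_hosts_line("hosts: cluster files [NOTFOUND=return] dns"))
    print(check_hosts_line("hosts: dns files"))
```

To check a real configuration, the text of the nsswitch.conf file would be read and passed to the function in place of the sample strings.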


803570 lkcm_parm: invalid handle was passed %s %d

Description:

Solution:


803649 Failed to check whether the resource is a logical host resource.

Description:

While retrieving the IP addresses from the network resources in the resource group, the attempt to check whether the resource is a logical host resource or not has failed.

Solution:

An internal error or an API call failure might be the reason. Check the error messages that occurred just before this message. If there is an internal error, contact your authorized Sun service provider. For an API call failure, check the syslog messages from other components. For the resource name and resource group name, check the syslog tag.


803719 host %s failed, and clnt_spcreateerror returned NULL

Description:

The rgm is not able to establish an rpc connection to the rpc.fed server on the host shown, and the rpc error could not be read. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


804457 Error reading properties; using old properties

Description:

Solution:


804658 clexecd: close returned %d while exec'ing (%s). Exiting.

Description:

The clexecd program has encountered a failed close(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


804791 A warm restart of rpcbind may be in progress.

Description:

The HA-NFS probe detected that the rpcbind daemon is not running; however, it also detected that a warm restart of rpcbind is in progress.

Solution:

If a warm restart is indeed in progress, ignore this message. Otherwise, check to see if the rpcbind daemon is running. If not, reboot the node. If the rpcbind process is not running, the HA-NFS probe would reboot the node itself if the Failover_mode on the resource is set to HARD.


804820 clcomm: path_manager failed to create RT lwp (%d)

Description:

The system failed to create a real time thread to support path manager heart beats.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


805735 Failed to connect to the host <%s> and port <%d>.

Description:

An error occurred while the fault monitor attempted to make a connection to the specified hostname and port.

Solution:

Wait for the fault monitor to correct this by performing a restart or failover. For more error descriptions, look at the syslog messages.


805788 reservation fatal error(%s) - service_name not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


806365 monitor_check: getlocalhostname() failed for resource <%s>, resource group <%s>

Description:

While attempting to process a scha_control(1HA,3HA) call, the rgmd failed in an attempt to obtain the hostname of the local node. This is considered a MONITOR_CHECK method failure. This in turn will prevent the attempted failover of the resource group from its current master to a new master.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


806618 Resource group name is null.

Description:

This is an internal error. While attempting to retrieve the resource information, a null value was retrieved for the resource group name.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


806902 clutil: Could not create lwp during respawn

Description:

There was insufficient memory to support this operation.

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.


807015 Validation of URI %s failed

Description:

The validation of the URI entered in the monitor_uri_list failed.

Solution:

Make sure a proper URI is entered. Check the syslog and /var/adm/messages for the exact error. Fix it and set the monitor_uri_list extension property again.


807249 CMM: Node %s (nodeid = %d) with votecount = %d removed.

Description:

The specified node with the specified votecount has been removed from the cluster.

Solution:

This is an informational message, no user action is needed.


808444 lkcm_reg: Unix DLM version (?) and the OSD library version (%d) are not compatible. Unix DLM versions acceptable to this library are: %d

Description:

UNIX DLM and Oracle DLM are not compatible. Compatible versions will be printed as part of this message.

Solution:

Check the installation procedure to make sure you have the correct versions of Oracle DLM and Unix DLM. Contact your Sun service representative if the versions cannot be resolved.


808746 Node id %d is higher than the maximum node id of %d in the cluster.

Description:

In one of the scalable networking properties, a node id was encountered that was higher than expected.

Solution:

Verify that the nodes listed in the scalable networking properties are still valid cluster members.


809322 Couldn't create deleted directory: error (%d)

Description:

The file system is unable to create temporary copies of deleted files.

Solution:

Mount the affected file system as a local file system, and ensure that there is no file system entry with name "._" at the root level of that file system. Alternatively, run fsck on the device to ensure that the file system is not corrupt.


809329 No adapter for node %s.

Description:

No IPMP group has been specified for this node.

Solution:

If this error message has occurred during resource creation, supply valid adapter information and retry it. If this message has occurred after resource creation, remove the LogicalHostname resource and recreate it with the correct IPMP group for each node which is a potential master of the resource group.


809554 Unable to access directory %s:%s.

Description:

A HA-NFS method attempted to access the specified directory but was unable to do so. The reason for the failure is also logged.

Solution:

If the directory is on a mounted filesystem, make sure the filesystem is currently mounted. If the pathname of the directory is not what you expected, check to see if the Pathprefix property of the resource group is set correctly. If this error occurs in any method other than VALIDATE, HA-NFS would attempt to recover the situation by either failing over to another node or (in case of Stop and Postnet_stop) by rebooting the node.


809858 ERROR: method <%s> timeout for resource <%s> is not an integer

Description:

The indicated resource method timeout, as stored in the CCR, is not an integer value. This might indicate corruption of CCR data or rgmd in-memory state. The method invocation will fail; depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Use scstat(1M) -g and scrgadm(1M) -pvv to examine resource properties. If the values appear corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


809956 PCSEXIT: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


809985 Elements in Confdir_list and Port_list must be 1-1 mapping

Description:

The Confdir_list and Port_list properties must contain the same number of entries, thus maintaining a 1-1 mapping between the two.

Solution:

Using the appropriate scrgadm command, configure this resource to contain the same number of entries in the Confdir_list and the Port_list properties.
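The 1-1 mapping requirement can be illustrated with a short check. This is a minimal sketch for illustration only: the function name pair_confdirs_with_ports and the sample paths and port values are hypothetical and are not taken from any real configuration.

```python
def pair_confdirs_with_ports(confdir_list, port_list):
    """Pair each Confdir_list entry with the Port_list entry at the same position.

    Raises ValueError when the two lists differ in length, which is the
    condition that this message reports.
    """
    if len(confdir_list) != len(port_list):
        raise ValueError(
            f"Confdir_list has {len(confdir_list)} entries but "
            f"Port_list has {len(port_list)}"
        )
    return list(zip(confdir_list, port_list))


if __name__ == "__main__":
    # Hypothetical values, shown only to illustrate the pairing.
    confdirs = ["/global/app/instance-a", "/global/app/instance-b"]
    ports = ["80/tcp", "8080/tcp"]
    for confdir, port in pair_confdirs_with_ports(confdirs, ports):
        print(confdir, "->", port)
```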


810318 Unable to get the resource group handle: %s


810551 fatal: Unable to bind president to nameserver

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


811254 VALIDATE failed on resource <%s>, resource group <%s>

Description:

The resource's VALIDATE method exited with a non-zero exit code. This indicates that an attempted update of a resource or resource group is invalid.

Solution:

Examine syslog messages occurring just before this one to determine the cause of validation failure. Re-try the update.


811357 Successfully started BV servers on $HOSTNAME.

Description:

Just an informational message that the BV servers on the specified host have started.

Solution:

No action needed.


811463 match_online_key failed strdup for (%s)

Description:

Call to strdup failed. The "strdup" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.


812706 dl_attach: DL_OK_ACK protocol error

Description:

Could not attach to the private interconnect interface.

Solution:

Reboot of the node might fix the problem.


812742 read: %s

Description:

The rpc.fed server was not able to execute the read system call properly. The message contains the system error. The server will not capture the output from methods it runs.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


813317 Failed to open the cluster handle: %s.

Description:

An internal error occurred while attempting to open a handle for an object.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


813831 reservation warning(%s) - MHIOCSTATUS error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


813866 Property %s has no hostnames for resource %s.

Description:

The named property does not have any hostnames set for it.

Solution:

Re-create the named resource with one or more hostnames.


813977 Node %d is listed twice in property %s.

Description:

The node in the message was listed twice in the named property.

Solution:

Specify the property with only one occurrence of the node.


813990 Started the HA-NFS system fault monitor.

Description:

The HA-NFS system fault monitor was started successfully.

Solution:

No action required.


814232 fork() failed: %m.

Description:

The fork() system call failed for the given reason.

Solution:

If system resources are not available, consider rebooting the node.


814420 bind: %s

Description:

Solution:


814905 Could not start up DCS client because major numbers on this node do not match the ones on other nodes. See /var/adm/messages for previous errors.

Description:

Some drivers identified in previous messages do not have the same major number across cluster nodes, and devices owned by the driver are being used in global device services.

Solution:

Look in the /etc/name_to_major file on each cluster node to see if the major number for the driver matches across the cluster. If a driver is missing from the /etc/name_to_major file on some of the nodes, then most likely, the package the driver ships in was not installed successfully on all nodes. If this is the case, install that package on the nodes that don't have it. If the driver exists on all nodes but has different major numbers, see the documentation that shipped with this product for ways to correct this problem.
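The comparison of /etc/name_to_major across nodes can be scripted once a copy of the file from each node has been collected. The following is a minimal sketch: the function names, node names, driver names, and major numbers are hypothetical, and the file is assumed to hold one "driver-name major-number" pair per line.

```python
def parse_name_to_major(text):
    """Parse name_to_major content into {driver_name: major_number}."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore any comment text (assumption)
        if len(fields) >= 2:
            table[fields[0]] = int(fields[1])
    return table


def find_mismatches(tables):
    """Given {node: {driver: major}}, return the drivers whose major number
    differs between nodes or that are missing on some node (shown as None)."""
    drivers = set()
    for table in tables.values():
        drivers.update(table)
    mismatches = {}
    for driver in sorted(drivers):
        per_node = {node: table.get(driver) for node, table in tables.items()}
        if len(set(per_node.values())) > 1:
            mismatches[driver] = per_node
    return mismatches


if __name__ == "__main__":
    # Hypothetical file contents collected from two nodes.
    node1 = parse_name_to_major("sd 32\nexampledrv 210\n")
    node2 = parse_name_to_major("sd 32\nexampledrv 211\n")
    for driver, per_node in find_mismatches({"node1": node1, "node2": node2}).items():
        print(driver, per_node)
```

A driver reported with None on some node corresponds to the missing-package case described above; a driver reported with different numbers corresponds to the mismatched-major-number case.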


815551 System property %s with value %s has an empty list element.

Description:

The system property that was named does not have a value for one of its list elements.

Solution:

Assign the property to have a value where all list elements have values.


815833 Malformed property value pair %s.


816002 The port number %d from entry %s in property %s was not for a nonsecure port.

Description:

The Netscape Directory Server instance has been configured as nonsecure, but the port number given in the list property is for a secure port.

Solution:

Remove the entry from the list or change its port number to correspond to a nonsecure port.


816578 Node %u attempting to join cluster has incompatible cluster software. %s not compatible with %s

Description:

A node is attempting to join the cluster but it is either using an incompatible software version or is booted in a different mode (32-bit vs. 64-bit).

Solution:

Ensure that all nodes have the same clustering software installed and are booted in the same mode.


817592 HA: rma::admin_impl failed to bind

Description:

An HA framework component failed to register with the name server.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


818821 Value %d is listed twice in property %s.

Description:

The value listed occurs twice in the named property.

Solution:

Specify the property with only one occurrence of the value.


818824 HA: rma::reconf can't talk to RM

Description:

An HA framework component failed to register with the Replica Manager.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


818836 Value %s is listed twice in property %s.

Description:

The value listed occurs twice in the named property.

Solution:

Specify the property with only one occurrence of the value.


819642 fatal: unable to register RPC service; aborting node

Description:

The rgmd was unable to start up successfully because it failed to register an RPC service. It will produce a core file and will force the node to halt or reboot.

Solution:

If rebooting the node doesn't fix the problem, examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


819738 Property %s is not set - %s.

Description:

The property has not been set by the user and must be.

Solution:

Reissue the scrgadm command with the required property and value.


819721 Failed to start %s.

Description:

Sun Cluster could not start the application. It would attempt to start the service on another node if possible.

Solution:

1) Check prior syslog messages for specific problems and correct them. 2) This problem may occur when the cluster is under load and Sun Cluster cannot start the application within the timeout period specified. You may consider increasing the Start_timeout property. 3) If the resource was unable to start on any node, resource would be in START_FAILED state. In this case, use scswitch to bring the resource ONLINE on this node. 4) If the service was successfully started on another node, attempt to restart the service on this node using scswitch. 5) If the above steps do not help, disable the resource using scswitch. Check to see that the application can run outside of the Sun Cluster framework. If it cannot, fix any problems specific to the application, until the application can run outside of the Sun Cluster framework. Enable the resource using scswitch. If the application runs outside of the Sun Cluster framework but not in response to starting the data service, contact your authorized Sun service provider for assistance in diagnosing the problem.


820394 Cannot check online status. Server processes are not running.

Description:

HA-Oracle could not check the online status of the Oracle server. Oracle server processes are not running.

Solution:

Examine the 'Connect_string' property of the resource. Make sure that the user ID and password specified in the connect string are correct and that the user has been granted permission to connect to the server. Check whether the Oracle server can be started manually. Examine the log files and setup.


821304 Failed to retrieve the resource group information.

Description:

A Sun Cluster data service has failed to retrieve the resource group property information. Low memory or an API call failure might be the reason.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components.


821781 Fencing shared disk groups: %s

Description:

A reservation failfast will be set so nodes which share these disk groups will be brought down if they are fenced off by other nodes.

Solution:

None.


822385 Failed to retrieve process monitor facility tag.

Description:

Failed to create the tag that is used to register with the process monitor facility.

Solution:

Check the syslog messages that occurred just before this message. In case of an internal error, save the /var/adm/messages file and contact your authorized Sun service provider.


824468 Invalid probe values.

Description:

The values for system defined properties Retry_count and Retry_interval are not consistent with the property Thorough_Probe_Interval.

Solution:

Change the values of the properties to satisfy the following relationship: Thorough_Probe_Interval * Retry_count <= Retry_interval.
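The relationship can be checked with simple arithmetic before the properties are changed. The following is a minimal sketch: the function name probe_values_valid and the sample values are illustrative assumptions, and all three values are assumed to use the same time unit.

```python
def probe_values_valid(thorough_probe_interval, retry_count, retry_interval):
    """Return True when Thorough_Probe_Interval * Retry_count <= Retry_interval."""
    return thorough_probe_interval * retry_count <= retry_interval


if __name__ == "__main__":
    # Sample values only: 60 * 2 = 120, which fits in a Retry_interval of 300.
    print(probe_values_valid(60, 2, 300))
    # 60 * 10 = 600, which does not fit in a Retry_interval of 300.
    print(probe_values_valid(60, 10, 300))
```

When the check fails, either Retry_interval must be raised to at least the product or one of the two factors must be lowered.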


824550 clcomm: Invalid flow control parameters

Description:

The flow control policy is controlled by a set of parameters. These parameters do not satisfy guidelines. Another message from validay_policy will have already identified the specific problem.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


824861 Resource %s named in property %s is not a SharedAddress resource.

Description:

The resource given for the named property is not a SharedAddress resource. All resources for that property must be SharedAddress resources.

Solution:

Specify only SharedAddresses for the named property.


825274 idl_scha_control_checkall(): IDL Exception on node <%d>

Description:

During a failover attempt, the scha_control function was unable to check the health of the indicated node, because of an error in inter-node communication. This was probably caused by the death of the indicated node during scha_control execution. The RGM will still attempt to master the resource group on another node, if available.

Solution:

No action is required; the rgmd should recover automatically. Identify what caused the node to die by examining syslog output. The syslog output might indicate further remedial actions.


826050 Failed to retrieve the cluster property %s for %s: %s.

Description:

The query for a property failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


826353 Unable to open /dev/console: %s

Description:

While starting up, one of the rgmd daemons was not able to open /dev/console. The message contains the system error. This will prevent the daemon from starting on this node.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


826397 Invalid values for probe related parameters.

Description:

Validation of the probe related parameters failed. Invalid values are specified for these parameters.

Solution:

Retry_interval must be greater than or equal to the product of Thorough_probe_interval and Retry_count. Use scrgadm(1M) to modify the values of these parameters so that they satisfy this relationship.


826747 reservation error(%s) - do_scsi3_inkeys() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

For the user action required by this message, see the user action for message 192619.


827525 reservation message(%s) - Fencing other node from disk %s

Description:

The device fencing program is taking access to the specified device away from a non-cluster node.

Solution:

This is an informational message, no user action is needed.


828140 Starting %s.

Description:

Sun Cluster is starting the specified application.

Solution:

This is an informational message, no user action is needed.


828170 CCR: Unrecoverable failure during updating table %s.

Description:

CCR encountered an unrecoverable error while updating the indicated table on this node.

Solution:

The node needs to be rebooted. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


828171 stat of file %s failed.

Description:

Status of the named file could not be obtained.

Solution:

Check the permissions of the file and all components in the path prefix.
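
To find which component of the path prefix is at fault, each prefix can be examined in turn. This is a minimal sketch that assumes a POSIX-style path; the function name is hypothetical and is not a Sun Cluster utility.

```python
import os


def first_failing_component(path):
    """Return (prefix, error) for the first path prefix that cannot be
    stat'ed, or None when every prefix, including the file, is accessible."""
    parts = os.path.abspath(path).split(os.sep)
    current = os.sep
    for part in parts[1:]:
        current = os.path.join(current, part)
        try:
            os.stat(current)
        except OSError as err:
            # err.strerror distinguishes a missing entry from a permission problem.
            return current, err
    return None
```

Run the check as the same user that the failing program runs as, because permission results depend on the caller.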


828283 clconf: No memory to read quorum configuration table

Description:

Could not allocate memory while converting the quorum configuration information into quorum table.

Solution:

This is an unrecoverable error, and the cluster needs to be rebooted. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


828407 WARNING: lkcm_sync failed: unknown message type %d

Description:

A message of unknown type was sent to udlm. This will be ignored.

Solution:

None.


828474 resource group %s property changed.

Description:

This is a notification from the rgmd that the operator has edited a property of a resource group. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


828739 transition '%s' timed out for cluster, forcing reconfiguration.

Description:

Step transition failed. A reconfiguration will be initiated.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.
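
Several messages in this chapter ask for the same set of files. The following is a minimal sketch of gathering them into one archive on a node. The file patterns come from the text above; the function name and the archive path are hypothetical, and the script must run with enough privilege to read the files.

```python
import glob
import os
import tarfile

# Patterns taken from the Solution text above.
LOG_PATTERNS = [
    "/var/adm/messages",
    "/var/cluster/ucmm/ucmm_reconf.log",
    "/var/cluster/ucmm/dlm*/*logs/*",
]


def collect_logs(archive_path, patterns=LOG_PATTERNS):
    """Archive every regular file matching the patterns.

    Returns the list of files that were added to the archive."""
    added = []
    with tarfile.open(archive_path, "w:gz") as archive:
        for pattern in patterns:
            for path in sorted(glob.glob(pattern)):
                if os.path.isfile(path):
                    archive.add(path)
                    added.append(path)
    return added
```

A call such as collect_logs("/var/tmp/node1-logs.tar.gz") would be repeated on every node, since the files are requested from all the nodes.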


829117 scha_control GIVEOVER failed. error %d

Description:

The fault monitor detected problems in the RDBMS server. The attempt to switch over the resource to another node failed. The error returned by the scha_control API call is indicated in the message.

Solution:

None.


829132 scha_control GIVEOVER failed. error %d

Description:

The fault monitor detected problems in the RDBMS server. The attempt to switch over the resource to another node failed. The error returned by the scha_control API call is indicated in the message.

Solution:

None.


829262 Switchover (%s) error: cannot find clexecd

Description:

The file system specified in the message could not be hosted on the node the message came from. Check to see if the user program "clexecd" is running on that node.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


829384 INTERNAL ERROR: launch_method: state machine attempted to launch invalid method <%s> (method <%d>) for resource <%s>; aborting node

Description:

An internal error occurred when the rgmd attempted to launch an invalid method for the named resource. The rgmd will produce a core file and will force the node to halt or reboot.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


830211 Failed to accept connection on socket: %s.

Description:

While determining the health of the data service, the fault monitor failed to communicate with the process monitor facility.

Solution:

This is an internal error. Save the /var/adm/messages file and contact your authorized Sun service provider. For more details about the error, check the syslog messages.


831036 Service object [%s, %s, %d] created in group '%s'

Description:

A specific service known by its unique name SAP (service access point), the three-tuple, has been created in the designated group.

Solution:

This is an informational message, no user action is needed.


833126 Monitor server successfully started.

Description:

The Sybase Monitor server has been successfully started by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


833212 Attempting to start the data service under process monitor facility.

Description:

The function is going to request the PMF to start the data service. If the request fails, refer to the syslog messages that appear after this message.

Solution:

This is an informational message, no user action is required.


833229 Couldn't remove deleted directory file, '%s' error: (%d)

Description:

The file system is unable to create temporary copies of deleted files.

Solution:

Mount the affected file system as a local file system, and ensure that there is no file system entry with name "._" at the root level of that file system. Alternatively, run fsck on the device to ensure that the file system is not corrupt.


833729 Setup error. Unable to monitor database.

Description:

Fault monitor is unable to continue with database monitoring. This error can be the result of an incorrect setup, such as a wrong password for the fault monitor user or incorrect database access permissions, or of internal errors in the fault monitor. Fault monitor is stopped after logging this syslog message. More information and error codes are available in other syslog messages logged by the fault monitor prior to this message.

Solution:

Check syslog messages logged by the fault monitor. After correcting the setup, fault monitor can be started as follows:

scrgadm -n -M -j <resource>

scrgadm -e -M -j <resource>


833970 clcomm: getrlimit(RLIMIT_NOFILE): %s

Description:

During cluster initialization within this user process, the getrlimit call failed with the specified error.

Solution:

Read the man page for getrlimit for a more detailed description of the error.
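
To see the limit that the failing call was trying to read, the same query can be made from a script. This is a minimal sketch that uses Python's standard resource module as a stand-in for the C getrlimit call; the helper names are hypothetical.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


try:
    soft, hard = nofile_limits()
    print("RLIMIT_NOFILE soft=%s hard=%s" % (describe(soft), describe(hard)))
except (ValueError, OSError) as err:
    # The system error text corresponds to the %s in the message above.
    print("getrlimit(RLIMIT_NOFILE) failed: %s" % err)
```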


834530 Failed to parse xml: invalid element %s

Description:

Solution:


834589 Error while executing scsblconfig.

Description:

There was an error while attempting to execute (source) the specified file. This may be due to improper permissions, or improper settings in this file.

Solution:

Please verify that the file has correct permissions. If permissions are correct, verify all the settings in this file. Try to manually source this file in korn shell ('. scsblconfig'), and correct any errors.


836593 Received a connect request from a node not configured in the cluster. Nodeid %u ipaddr 0x%x

Description:

CCR tables are temporarily out of sync.


837169 Starting listener %s.

Description:

Informational message. HA-Oracle will be starting Oracle listener.

Solution:

None


837211 Resource is already online.

Description:

An error occurred while attempting to restart the resource. The resource is already online.

Solution:

This is an internal error. Save the /var/adm/messages file from all the nodes. Contact your authorized Sun service provider.


837752 Failed to retrieve the resource group handle for %s while querying for property %s: %s.

Description:

Access to the object named failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


837760 monitored processes forked failed (errno=%d)

Description:

The rpc.pmfd server was not able to start (fork) the application, probably due to low memory, and the system error number is shown. An error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


838695 Unable to process client registration

Description:

Solution:


839641 t_alloc (reqp): %s

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


839649 t_alloc (resp): %s

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


839936 Some ip addresses may not be plumbed.

Description:

Some of the ip addresses managed by the LogicalHostname resource were not successfully brought on-line on this node.

Solution:

Use the ifconfig command to verify that the ip addresses are indeed absent. Check for any error messages before this one for a more precise reason for the failure. Use scswitch to move the resource group to another node.


839881 Media error encountered, but Auto_end_bkp failed.

Description:

The HA-Oracle start method identified that one or more datafiles is in need of recovery. This was caused by the file(s) being left in hot backup mode. The Auto_end_bkp extension property is enabled, but failed to recover the database.

Solution:

Examine the log files for the cause of the failure to recover the database.


840542 OFF_PENDING_BOOT: bad resource state <%s> (%d) for resource <%s>

Description:

The rgmd state machine has discovered a resource in an unexpected state on the local node. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


840619 Invalid value was returned for resource group property %s for %s.

Description:

The value returned for the named property was not valid.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


840696 DNS database directory %s is not readable: %s

Description:

The DNS database directory is not readable. This may be due to the directory not existing or the permissions not being set properly.

Solution:

Make sure the directory exists and has read permission set appropriately. Look at the prior syslog messages for any specific problems and correct them.
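
The two causes named in the description, a missing directory and wrong permissions, can be told apart with a short check. This is a minimal sketch; the function name and the returned strings are hypothetical. Run it as the user that the DNS data service runs as, because access results depend on the caller.

```python
import os


def check_readable_directory(path):
    """Classify why a directory might not be readable."""
    if not os.path.exists(path):
        return "missing"
    if not os.path.isdir(path):
        return "not a directory"
    # Listing a directory needs read permission; entering it needs search (x).
    if not os.access(path, os.R_OK | os.X_OK):
        return "not readable"
    return "ok"
```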


841616 CMM: This node has been preempted from quorum device %s.

Description:

This node's reservation key was on the specified quorum device, but is no longer present, implying that this node has been preempted by another cluster partition. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

There may be other related messages that may indicate why quorum was lost. Determine why quorum was lost on this node, resolve the problem and reboot this node.
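
The majority rule behind preemption can be illustrated with a small calculation. This is a sketch under stated assumptions: the function name is hypothetical, and the vote counts in the example (two nodes with one vote each plus a quorum device with one vote) are chosen only for illustration.

```python
def has_majority(votes_held, total_votes):
    """Return True when a partition holds more than half of all configured votes."""
    return 2 * votes_held > total_votes


# Hypothetical two-node cluster: 1 vote per node + 1 quorum device vote = 3 votes.
TOTAL_VOTES = 3
print(has_majority(2, TOTAL_VOTES))  # node plus quorum device: survives
print(has_majority(1, TOTAL_VOTES))  # node alone: preempted, must abort
```

Because a majority is strictly more than half, at most one subcluster can satisfy the test at any time.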


841719 listener %s is not running. restart limit reached. Stopping fault monitor.

Description:

Listener is not running. Listener monitor has reached the restart limit specified in 'Retry_count' and 'Retry_interval' properties. Listener monitor will be stopped.

Solution:

Check Oracle listener setup. Please make sure that Listener_name specified in the resource property is configured in listener.ora file. Check 'Host' property of listener in listener.ora file. Examine log file and syslog messages for additional information. Stop and start listener monitor.


841875 remote node died

Description:

An inter-node communication failed because another cluster node died.

Solution:

No action is required. The cluster will reconfigure automatically. Examine syslog output on the rebooted node to determine the cause of node death.


842059 Cannot create monitor child process. fork failed with %m

Description:

Fault monitor is not able to create child process. Fault monitor will be restarted. If problem persists, fault monitor will be stopped.


842313 clexecd: Sending fd on common channel returned %d. Exiting.

Description:

clexecd program has encountered a failed fcntl(2) system call. The error message indicates the error number for the failure.

Solution:

The node will halt or reboot itself to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


842382 fcntl: %s

Description:

A server (rpc.pmfd or rpc.fed) was not able to execute the action shown, and the process associated with the tag is not started. The error message is shown.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


842712 clcomm: solaris xdoor door_create failed

Description:

A door_create operation failed. Refer to the "door_create" man page for more information.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


843070 Failed to disconnect from port %d of resource %s.

Description:

Unable to connect to the port at hostname and port.

Solution:

If the problem persists, Sun Cluster will restart or fail over the resource.


843013 Data service failed to stay up. Start method failed.

Description:

The data service may have failed to startup completely.

Solution:

Look in /var/adm/messages for the cause of failure. Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


843070 Failed to disconnect from port %d of resource %s.

Description:

An error occurred while fault monitor attempted to disconnect from the specified hostname and port.

Solution:

Wait for the fault monitor to correct this by doing restart or failover. For more error descriptions, look at the syslog messages.


843093 fatal: Got error <%d> trying to read CCR when enabling monitor of resource <%s>; aborting node

Description:

Rgmd failed to read updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


843876 Media error encountered, and Auto_end_bkp was successful.

Description:

The HA-Oracle start method identified that one or more datafiles was in need of recovery. This was caused by the file(s) being left in hot backup mode. The Auto_end_bkp extension property is enabled, and successfully recovered and opened the database.

Solution:

None. This is an informational message. Oracle server is online.


843978 Socket creation failed: %s.

Description:

System is unable to create a socket.

Solution:

This might result from a lack of system resources. Check whether the system is low in memory and take appropriate action. For specific error information, check the syslog message.


843983 CMM: Node %s: attempting to join cluster.

Description:

The specified node is attempting to become a member of the cluster.

Solution:

This is an informational message, no user action is needed.


845866 Failover attempt failed: %s.

Description:

The failover attempt of the resource is rejected or encountered an error.

Solution:

For a more detailed error message, check the syslog messages. Check whether the Pingpong_interval has an appropriate value. If not, adjust it using scrgadm(1M). Otherwise, use scswitch to switch the resource group to a healthy node.


845977 Failed to parse xml: low memory

Description:

Solution:


846053 Fast path enable failed on %s%d, could cause path timeouts

Description:

DLPI fast path could not be enabled on the device.

Solution:

Check if the right version of the driver is in use.


846376 fatal: Got error <%d> trying to read CCR when making resource group <%s> unmanaged; aborting node

Description:

Rgmd failed to read updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


846420 CMM: Nodes %ld and %ld are disconnected from each other; node %ld will abort using %s rule.

Description:

Due to a connection failure between the two specified non-local nodes, one of the nodes must be halted to avoid a "split brain" configuration. The CMM used the specified rule to decide which node to fail. Rules are:

rebootee: If one node is rebooting and the other was a member of the cluster, the node that is rebooting must abort.

quorum: The node with greater control of quorum device votes survives and the other node aborts.

node number: The node with higher node number aborts.

Solution:

The cause of the failure should be resolved and the node should be rebooted if node failure is unexpected.
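
The three rules named in the description can be sketched as a decision function. This is an illustration, not the CMM implementation. It assumes that the rules are tried in the order listed and that a node that is not rebooting counts as a cluster member; the Node type and its field names are hypothetical.

```python
from collections import namedtuple

# Hypothetical model of a node; the field names are not Sun Cluster names.
Node = namedtuple("Node", "node_id rebooting quorum_votes")


def choose_node_to_abort(a, b):
    """Return (node_id_to_abort, rule_name) for two disconnected nodes."""
    # rebootee: exactly one node is rebooting, so that node aborts.
    if a.rebooting != b.rebooting:
        loser = a if a.rebooting else b
        return loser.node_id, "rebootee"
    # quorum: the node controlling fewer quorum device votes aborts.
    if a.quorum_votes != b.quorum_votes:
        loser = a if a.quorum_votes < b.quorum_votes else b
        return loser.node_id, "quorum"
    # node number: the node with the higher node number aborts.
    loser = a if a.node_id > b.node_id else b
    return loser.node_id, "node number"
```

For example, choose_node_to_abort(Node(1, False, 1), Node(2, False, 1)) falls through to the last rule and selects node 2.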


846813 Switchover (%s) error (%d) converting to primary

Description:

The file system specified in the message could not be hosted on the node the message came from.

Solution:

Check /var/adm/messages to make sure there were no device errors. If not, contact your authorized Sun service provider to determine whether a workaround or patch is available.


847065 Failed to start listener %s.

Description:

Failed to start Oracle listener.

Solution:

Check Oracle listener setup. Please make sure that Listener_name specified in the resource property is configured in listener.ora file. Check 'Host' property of listener in listener.ora file. Examine log file and syslog messages for additional information.


847124 getnetconfigent: %s

Description:

Call to getnetconfigent in udlm port setup failed. udlm fails to start and the node will eventually panic.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


847496 CMM: Reading reservation keys from quorum device %s failed with error %d.

Description:

The specified error was encountered while trying to read reservation keys on the specified quorum device.

Solution:

There may be other related messages on this and other nodes connected to this quorum device that may indicate the cause of this problem. Refer to the quorum disk repair section of the administration guide for resolving this problem.


847656 Command %s is not executable.

Description:

The specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start, does not refer to an executable file. This could be the result of 1) mis-configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.
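
The condition in the solution can be verified directly. This is a minimal sketch; the function name is hypothetical and is not a libdsdev routine.

```python
import os
import stat


def is_regular_executable(path):
    """Return True when path is a regular file that the caller may execute."""
    try:
        mode = os.stat(path).st_mode  # follows symbolic links
    except OSError:
        return False  # missing file or inaccessible path prefix
    return stat.S_ISREG(mode) and os.access(path, os.X_OK)
```

A False result for the pathname configured as a START or MONITOR_START method points to the first or third cause listed in the description.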


847809 Must be in cluster to start %s

Description:

Machine on which this command or daemon is running is not part of a cluster.

Solution:

Run the command on another machine, or make the machine part of a cluster by following the appropriate steps.


847916 (%s) netdir error: uaddr2taddr: %s

Description:

Call to uaddr2taddr() failed. The "uaddr2taddr" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


847978 reservation fatal error(UNKNOWN) - cluster_get_quorum_status() error, returned %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


847994 Plumb failed. tried to unplumb %s%d, unplumb failed with rc %d

Description:

Topology Manager failed to plumb an adapter for private network. A possible reason for plumb to fail is that it is already plumbed. Solaris Clustering tries to unplumb the adapter and plumb it for private use but it could not unplumb the adapter.

Solution:

Check if the adapter by that name exists.


848033 SharedAddress online.

Description:

The status of the sharedaddress resource is online.

Solution:

This is an informational message, no user action is needed.


848652 CMM aborting.

Description:

The node is going down due to a decision by the cluster membership monitor.

Solution:

This message is preceded by other messages indicating the specific cause of the abort, and the documentation for these preceding messages will explain what action should be taken. The node should be rebooted if node failure is unexpected.


848854 Failed to retrieve WLS extension properties.

Description:

The WLS Extension properties could not be retrieved.

Solution:

Check for other messages in syslog and /var/adm/messages for details of failure.


848943 clconf: No valid gdevname field for quorum device %d

Description:

The gdevname field for the quorum device was found to be incorrect while converting the quorum configuration information into the quorum table.

Solution:

Check the quorum configuration information.


849856 sigemptyset: %s

Description:

The rpc.pmfd server was not able to initialize a signal set. The message contains the system error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


850108 Validation failed. PARAMETER_FILE: %s does not exist

Description:

Oracle parameter file (typically init<sid>.ora) specified in property 'Parameter_file' does not exist or is not readable.

Solution:

Please make sure that 'Parameter_file' property is set to the existing Oracle parameter file. Reissue command to create/update


852212 reservation message(%s) - Taking ownership of disk %s away from non-cluster node

Description:

The device fencing program is taking access to the specified device away from a non-cluster node.

Solution:

This is an informational message, no user action is needed.


852497 scvxvmlg error - readlink(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


852615 reservation error(%s) - Unable to gain access to device '%s'

Description:

The device fencing program has encountered errors while trying to access a device.

Solution:

Another cluster node has fenced this node from the specified device, preventing this node from accessing that device. Access should have been reacquired when this node joined the cluster, but this node must have experienced problems. If the message specifies the 'node_join' transition, this node will be unable to access the specified device. If the failure occurred during the 'make_primary' transition, then this node will be unable to access the specified device and a device group containing the specified device may have failed to start on this node. An attempt can be made to acquire access to the device by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on this node. If a device group failed to start on this node, the scswitch command can be used to start the device group on this node if access can be reacquired. If the problem persists, please contact your authorized Sun service provider to determine whether a workaround or patch is available.


853478 Received non interrupt heartbeat on %s - path timeouts are likely.

Description:

Solaris Clustering requires network drivers to deliver heartbeat messages in the interrupt context. A heartbeat message has unexpectedly arrived in non interrupt context.

Solution:

Check if the right version of the driver is in use.


853956 INTERNAL ERROR: WLS extension properties structure is NULL.

Description:

This is an internal error.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


854468 failfast arm error: %d

Description:

Error during failfast device arm operation.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


854792 clcomm: error in copyin for cl_change_threads_min

Description:

The system failed a copy operation supporting a flow control state change.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


854894 No LogicalHostname resource in resource group.

Description:

The probe method for this data service could not find a LogicalHostname resource in the same resource group as the data service.

Solution:

Use scrgadm to configure the resource group to hold both the data service and the LogicalHostname.


856492 waitpid() failed: %m.

Description:

The waitpid() system call failed for the given reason.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


856880 Desired_primaries for resource group %s should be 0. Current value is %d.

Description:

The number of desired primaries for this resource group should be zero. In the event that a node dies or joins the cluster, the resource group might come online on some node, even if it was previously switched offline and was intended to remain offline.

Solution:

Set the Desired_primaries property for the resource group to zero.


856919 INTERNAL ERROR: process_resource: resource group <%s> is pending_methods but contains resource <%s> in STOP_FAILED state

Description:

During a resource creation, deletion, or update, the rgmd has discovered a resource in STOP_FAILED state. This may indicate an internal logic error in the rgmd, since updates are not permitted on the resource group until the STOP_FAILED error condition is cleared.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


857573 scvxvmlg error - rmdir(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


857792 UNIX DLM initiating cluster abort.

Description:

Due to an error encountered, unix dlm is initiating an abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


858256 Stopping %s with command %s.

Description:

Sun Cluster is stopping the specified application with the specified command.

Solution:

This is an informational message, no user action is needed.


859126 System property %s is empty.

Description:

The system property that was named does not have a value.

Solution:

Assign the property a value.


861738 Error: unknown error code\n

Description:

Solution:


862493 in libsecurity could not register on any transport in NETPATH

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish a rpc connection for the network specified, because it couldn't find any transport. An error message is output to syslog. This happened because either there are no available transports at all, or there are but none is a loopback.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


862716 sema_init: %s

Description:

The rpc.pmfd server was not able to initialize a semaphore, possibly due to low memory, and the system error is shown. The server does not perform the action requested by the client, and pmfadm returns error. An error message is also output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


862999 Siebel server components maybe unavailable or offline. No action will be taken.

Description:

Not all of the enabled Siebel server components are running.

Solution:

This is an informative message. Fault Monitor will not take any action. Please manually start the Siebel component(s) that may have gone down to ensure complete service.


865183 Cannot open pipe to child process. pipe() failed with %m

Description:

Fault monitor is not able to communicate with its child process. Fault monitor will be restarted. If the problem persists, fault monitor will be stopped.


865292 File %s should be owned by %s.

Description:

A program required the specified file to be owned by the specified user.

Solution:

Use the chown command to change the owner as suggested.
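
Before running chown, it can help to confirm which user currently owns the file. The following is a minimal sketch, not part of Sun Cluster; the helper names uid_for_user and is_owned_by_uid are hypothetical and exist only for illustration.

```python
import os
import pwd


def uid_for_user(user_name):
    """Look up the numeric uid for a login name (raises KeyError if unknown)."""
    return pwd.getpwnam(user_name).pw_uid


def is_owned_by_uid(path, expected_uid):
    """Return True when the file at path is owned by expected_uid."""
    return os.stat(path).st_uid == expected_uid
```

For example, is_owned_by_uid(path, uid_for_user("oracle")) would report whether a file is owned by a user named oracle; that user name is only an illustration.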


865635 lkcm_act: caller is not registered

Description:

udlm is not currently registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


866371 The listener %s is not running; retry_count <%s> exceeded. Attempting to switchover resource group.

Description:

Listener is not running. Listener monitor has reached the restart limit specified in 'Retry_count' and 'Retry_interval' properties. Listener and the resource group will be moved to another node.

Solution:

Check the Oracle listener setup. Make sure that the Listener_name specified in the resource property is configured in the listener.ora file. Check the 'Host' property of the listener in the listener.ora file. Examine the log file and syslog messages for additional information.


866624 clcomm: validate_policy: threads_low not big enough low %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. The low server thread level must not be less than twice the thread increment level for resource pools whose number of threads varies dynamically.

Solution:

No user action required.
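
The rule described above can be expressed as a one-line check. This is only a sketch of the stated constraint, not the clcomm implementation; the function and parameter names are hypothetical.

```python
def threads_low_is_valid(threads_low, threads_increment, dynamic_pool=True):
    """Check the documented rule: for pools whose thread count varies
    dynamically, the low thread level must be at least twice the
    thread increment level."""
    if not dynamic_pool:
        # The description states the rule only for dynamically sized pools.
        return True
    return threads_low >= 2 * threads_increment
```

With an increment of 4, a low level of 8 passes the check and a low level of 7 does not.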


867059 Could not shutdown replica for device service (%s). Some file system replicas that depend on this device service may already be shutdown. Future switchovers to this device service will not succeed unless this node is rebooted.

Description:

See message.

Solution:

If mounts or node reboots were in progress at the time this message was displayed, wait for that activity to complete, and then retry the command to shut down the device service replica. If not, then contact your authorized Sun service provider to determine whether a workaround or patch is available.


868245 Unable to process dbms log file.

Description:

An error occurred when processing the DBMS log file. As a result of this error, the fault monitor could not scan errors from the log file. This error can occur as a result of memory allocation problems.


868467 Process %s did not die in %d seconds.

Description:

HA-NFS attempted to stop the specified process ID but was unable to stop the process in a timely fashion. Since HA-NFS uses the SIGKILL signal to kill processes, this indicates a serious overload or kernel problem with the system.

Solution:

HA-NFS will take appropriate action. If this error occurs in a STOP method, the node will be rebooted. Increase the timeout on the appropriate method.


869196 Failed to get IPMP status for group %s (request failed with %d).

Description:

A query to get the state of an IPMP group failed. This may cause a method failure to occur.

Solution:

Make sure the network monitoring daemon (pnmd) is running. Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


869406 Failed to communicate with server %s port %d: %s.

Description:

The data service fault monitor probe was trying to read from or write to the service specified and failed. Sun Cluster will attempt to correct the situation by either doing a restart or a failover of the data service. The problem may be due to an overloaded system or other problems, causing a timeout to occur before communications could be completed.

Solution:

If this problem is due to an overloaded system, you may consider increasing the Probe_timeout property.


870181 Failed to retrieve the resource handle for %s while querying for property %s: %s.

Description:

Access to the object named failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


870317 INTERNAL ERROR: START method is not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


870566 clutil: Scheduling class %s not configured

Description:

An attempt to change the thread scheduling class failed, because the scheduling class was not configured.

Solution:

Configure the system to support the desired thread scheduling class.


871642 Validation failed. Invalid command line %s %s

Description:

Unable to process parameters passed to the call back method. This is an internal error.

Solution:

Please report this problem.


872086 Service is degraded.

Description:

The probe has detected a failure in the data service and is setting the resource's status to degraded.

Solution:

Wait for the fault monitor to restart the data service. Check the syslog messages and configuration of the data service.


872599 Error in getting service name for device path <%s>

Description:

Cannot map the device path to a valid global service name.

Solution:

Check the path passed into extension property "ServicePaths" of SUNW.HAStorage type resource.


872839 Resource is already stopped.

Description:

Sun Cluster attempted to stop the resource, but found it already stopped.

Solution:

No user action required.


874012 Command %s timed out. Will continue to start up liveCache.

Description:

The listed command timed out. Will continue to start up liveCache.

Solution:

Informative message. HA-liveCache will continue to start up liveCache. No immediate action is required. This could be caused by heavy system load. However, if the system load is not heavy, check the installation and configuration of liveCache. Make sure the listed command can be run manually on the system.


874167 Multi-IP group '%s' updated

Description:

The Multi-IP group by that name is modified.

Solution:

This is an informational message, no user action is needed.


874879 clcomm: Path %s being deleted

Description:

A communication link is being removed with another node. The interconnect may have failed or the remote node may be down.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


875171 clcomm: Pathend %p: %d is not a pathend state

Description:

The system maintains state information about a path. The state information is invalid.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


875345 None of the shared paths in file %s are valid.

Description:

All the paths specified in the dfstab.<resource_name> file are invalid.

Solution:

Check that those paths are valid. This might be a result of the underlying disk failure in an unavailable file system. The monitor_check method would thus fail and the HA-NFS resource would not be brought online on this node. However, it is advisable that the file system be brought online soon.


875595 CMM: Shutdown timer expired. Halting.

Description:

The node could not complete its shutdown sequence within the halt timeout, and is aborting to enable another node to safely take over its services.

Solution:

This is an informational message, no user action is needed.


875796 CMM: Reconfiguration callback timed out; node aborting.

Description:

One or more CMM client callbacks timed out and the node will be aborted.

Solution:

There may be other related messages on this node which may help diagnose the problem. Resolve the problem and reboot the node if node failure is unexpected. If unable to resolve the problem, contact your authorized Sun service provider to determine whether a workaround or patch is available.


875939 ERROR: Failed to initialize callbacks for Global_resources_used, error code <%d>

Description:

The rgmd encountered an error while trying to initialize the Global_resources_used mechanism on this node. This is not considered a fatal error, but probably means that method timeouts will not be suspended while a device service is failing over. This could cause unneeded failovers of resource groups when device groups are switched over.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem. This error might be cleared by rebooting the node.


876090 fatal: must be superuser to start %s

Description:

The rgmd can only be executed by the super-user.

Solution:

This probably occurred because a non-root user attempted to start the rgmd manually. Normally, the rgmd is started automatically when the node is booted.


876324 CCR: CCR transaction manager failed to register with the cluster HA framework.

Description:

The CCR transaction manager failed to register with the cluster HA framework.

Solution:

This is an unrecoverable error, and the cluster needs to be rebooted. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


876485 No execute permissions to the file %s.

Description:

The execute permissions to the specified file are not set.

Solution:

Set the execute permissions to this file.


876834 Could not start server

Description:

HA-Oracle failed to start Oracle server. Syslog messages and log file will provide additional information on possible reasons of failure.

Solution:

Check whether Oracle server can be started manually. Examine the log files and setup.


877905 ff_ioctl: %s

Description:

A server (rpc.pmfd or rpc.fed) was not able to arm or disarm the failfast device, which ensures that the host aborts if the server dies. The error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


878089 fatal: realloc: %s (UNIX error %d)

Description:

The rgmd failed to allocate memory, most likely because the system has run out of swap space. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

The problem was probably cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.


878135 WARNING: udlm_update_from_saved_msg

Description:

There is no saved message to update udlm.

Solution:

None. This is a warning only.


878447 Multi-IP group '%s' created

Description:

A Multi-IP group by that name is created.

Solution:

This is an informational message, no user action is needed.


879106 Failed to complete command %s. Will continue to start up liveCache.

Description:

The listed command failed to complete. HA-liveCache will continue to start up liveCache.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


879380 pmf_monitor_children: Error stopping <%s>: %s

Description:

An error occurred while rpc.pmfd attempted to send a KILL signal to one of the processes of the given tag. The reason for the failure is also given. rpc.pmfd attempted to kill the process because a previous error occurred while creating a monitor process for the process to be monitored.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


879511 reservation fatal error(%s) - service_class not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.
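
The transition-specific guidance above can be summarized as a lookup table. The sketch below is illustrative only: it assumes the transition name appears somewhere in the logged message text, which the message format does not guarantee, and the names TRANSITION_GUIDANCE and guidance_for are hypothetical.

```python
# Summary of the guidance in the Solution text, keyed by transition name.
TRANSITION_GUIDANCE = {
    "node_join": (
        "This node may be unable to access shared devices. Running "
        "'/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster "
        "nodes may restore access."
    ),
    "release_shared_scsi2": (
        "A node that was joining the cluster may be unable to access "
        "shared devices. Running "
        "'/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster "
        "nodes may restore access."
    ),
    "make_primary": (
        "A device group failed to start on this node. The scswitch "
        "command may be used to switch the device group to this node "
        "or to retry starting it."
    ),
    "primary_to_secondary": (
        "The shutdown or switchover of a device group failed. The "
        "desired action may be retried."
    ),
}


def guidance_for(message_text):
    """Return the guidance for the first known transition named in the
    message text, or None when no known transition is mentioned."""
    for transition, advice in TRANSITION_GUIDANCE.items():
        if transition in message_text:
            return advice
    return None
```

In every case the Solution text still applies in full: contact your service provider and supply copies of /var/adm/messages from all nodes.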


880317 scvxvmlg fatal error - %s does not exist, VxVM not installed?

Description:

The program responsible for maintaining the VxVM namespace was unable to access the local VxVM device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If VxVM is used to manage shared devices, it must be installed on all cluster nodes. If VxVM is installed on this node, but the local VxVM namespace does not exist, VxVM may have to be re-installed on this node. If VxVM is installed on this node and the local VxVM device namespace does exist, the namespace management can be manually run on this node by executing '/usr/cluster/lib/dcs/scvxvmlg' on this node. If the problem persists, please contact your authorized Sun service provider to determine whether a workaround or patch is available. If VxVM is not being used on this cluster, then no user action is required.


880651 No hostnames specified.

Description:

An attempt was made to create a Network resource without specifying a hostname.

Solution:

At least one hostname must be specified via the -l option to scrgadm(1M).


880835 pmf_search_children: Error stopping <%s>: %s

Description:

An error occurred while rpc.pmfd attempted to send a KILL signal to one of the processes of the given tag. The reason for the failure is also given. rpc.pmfd attempted to kill the process because a previous error occurred while creating a monitor process for the process to be monitored.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


883690 Failed to start Monitor server.

Description:

Sun Cluster HA for Sybase failed to start the monitor server. Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Please check whether the server can be started manually. Examine the HA-Sybase log files, monitor server log files and setup.


884114 clcomm: Adapter %s constructed

Description:

A network adapter has been initialized.

Solution:

No action required.


884438 A component of NFS did not start completely in %d seconds: prognum %lu, progversion %lu.

Description:

A daemon associated with NFS service did not finish registering with RPC within the specified timeout.

Solution:

Increase the timeout associated with the method during which this failure occurred.


884482 clconf: Quorum device ID %ld is invalid. The largest supported ID is %ld

Description:

The quorum device ID was found to be invalid while converting the quorum configuration information into the quorum table.

Solution:

Check the quorum configuration information.


884821 Unparsable registration

Description:

Solution:


884823 Prog <%s> step <%s>: stat of program file failed.

Description:

A step points to a file that is not executable. This may have been caused by incorrect installation of the package.

Solution:

Identify the program for the step. Check the permissions on the program. Reinstall the package if necessary.


884979 (%s) aborting, but got a message of type %d

Description:

While udlm was going through an abort, it received an unexpected message of the mentioned type.

Solution:

None.


887138 Extension property <Child_mon_level> has a value of <%d>

Description:

Resource property Child_mon_level is set to the given value.

Solution:

This is an informational message, no user action is needed.


887282 Mode for file %s needs to be %03o

Description:

The file needs to have the indicated mode.

Solution:

Set the mode of the file correctly.
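
The message prints the required mode as three octal digits (%03o). As a minimal sketch, the current permission bits of a file can be compared against that value as shown below; the helper names are hypothetical and are not part of Sun Cluster.

```python
import os
import stat


def mode_matches(path, required_mode):
    """Return True when the permission bits of path equal required_mode."""
    return stat.S_IMODE(os.stat(path).st_mode) == required_mode


def format_mode(mode):
    """Render a mode the way the message does: three octal digits."""
    return "%03o" % mode
```

If the check fails, chmod with the octal value from the message sets the mode.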


887666 clcomm: sxdoor: op %d fcntl failed: %s

Description:

A user level process is unmarshalling a door descriptor and creating a new door. The specified fcntl operation failed. The "fcntl" man page describes possible error codes.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


887669 clcomm: coalesce_region request(%d) > MTUsize(%d)

Description:

While supporting an invocation, the system wanted to create one buffer that could hold the data from two buffers. The system cannot create a big enough buffer. After generating another system error message, the system will panic. This message only appears on debug systems.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


888259 clcomm: Path %s being deleted and cleaned

Description:

A communication link is being removed with another node. The interconnect may have failed or the remote node may be down.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


889303 Failed to read from kstat:%s

Description:

See 176151

Solution:

See 176151


889884 scha_control RESTART failed. error %d

Description:

Fault monitor had detected problems in RDBMS server. Attempt to restart RDBMS server on the same node failed. Error returned by API call scha_control is indicated in the message.

Solution:

None.


889899 scha_control RESTART failed. error %d

Description:

Fault monitor had detected problems in RDBMS server. Attempt to restart RDBMS server on the same node failed. Error returned by API call scha_control is indicated in the message.

Solution:

None.


890129 dl_attach: DL_ERROR_ACK access error

Description:

Could not attach to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


890413 %s: state transition from %s to %s

Description:

A state transition has happened for the IPMP group. Transition to DOWN happens when all adapters in an IPMP group are determined to be faulty.

Solution:

If an IPMP group transitions to DOWN state, check for error messages about adapters being faulty and take suggested user actions accordingly. No user action is needed for other state transitions.


890927 HA: repl_mgr_impl: thr_create failed

Description:

The system could not create the needed thread, because there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage.


891362 scha_resource_open error (%d)

Description:

Error occurred in API call scha_resource_open.

Solution:

Check syslog messages for errors logged from other system modules. Stop and start fault monitor. If error persists then disable fault monitor and report the problem.


891424 Starting %s with command %s.

Description:

Sun Cluster is starting the specified application with the specified command.

Solution:

This is an informational message, no user action is needed.


891462 in libsecurity caller is %d, not the desired uid %d

Description:

A server (rpc.pmfd, rpc.fed or rgmd) refused an rpc connection from a client because it has the wrong uid. The actual and desired uids are shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


892183 libsecurity: NULL RPC to program %ld failed will not retry %s

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection, because it could not execute a test rpc call. The program will not retry because the time limit of 1 hr was exceeded. The message shows the specific rpc error. The program number is shown. To find out what program corresponds to this number, use the rpcinfo command. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


893019 %d-bit saposcol is running on %d-bit Solaris.

Description:

The architecture of saposcol is not compatible with the currently running Solaris version. For example, you have a 64-bit saposcol running on a 32-bit Solaris machine or vice versa.

Solution:

Make sure the correct saposcol is installed on the cluster.


893095 Service <%s> with path <%s> is not available. Retrying...

Description:

The service is not available yet. prenet_start method of SUNW.HAStorage is still testing and waiting.

Solution:

No user action is required.


894418 reservation warning(%s) - Found invalid key, preempting

Description:

The device fencing program has discovered an invalid scsi-3 key on the specified device and is removing it.

Solution:

This is an informational message, no user action is needed.


894711 Could not resolve '%s' in the name server. Exiting.

Description:

clexecd program was unable to start due to an error in registering itself with the low-level clustering software.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


894800 Dependent hosts are not up. Not starting BV servers on $HOSTNAME.

Description:

The hosts in the startup order on which the specified host depends have not started.

Solution:

Bring the resource group containing the specified host online if it is not yet running. If the resource group is already online, the probe will take appropriate action.


895149 (%s) t_open: tli error: %s

Description:

Call to t_open() failed. The "t_open" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


895159 clcomm: solaris xdoor dup failed: %s

Description:

A dup operation failed. The "dup" man page describes possible error codes.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


895821 INTERNAL ERROR: cannot get nodeid for node <%s>

Description:

The scha_control function is unable to obtain the node id number for one of the resource group's potential masters. This node will not be considered a candidate destination for the scha_control giveover.

Solution:

Try issuing an scstat(1M) -n command and see if it successfully reports status for all nodes. If not, then the cluster configuration data may be corrupted. If so, then there may be an internal logic error in the rgmd. In either case, please save copies of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


896275 CCR: Ignoring override field for table %s on joining node %s.

Description:

The override flag for a table indicates that the CCR should use this copy as the final version when the cluster is coming up. If the cluster already has a valid copy while the indicated node is joining the cluster, then the override flag on the joining node is ignored.

Solution:

This is an informational message, no user action is needed.


896441 Unknown scalable service method code: %d.

Description:

The method code given is not a method code that was expected.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


896532 Invalid variable name in Environment_file. Ignoring %s

Description:

HA-Sybase reads the Environment_file and exports the variables declared in it. The syntax for declaring variables is VARIABLE=VALUE. Lines starting with '#' are treated as comments. Lines starting with 'export' are ignored. VARIABLE is expected to be a valid Korn shell variable name that starts with a letter or '_' and contains only alphanumerics and '_'.

Solution:

Please check the syntax and correct the Environment_file.
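
To find the offending lines before HA-Sybase reads the file, the naming rule in the description can be approximated with a regular expression. This is a sketch under stated assumptions, not the HA-Sybase parser: skipping blank lines and lines that start with '#' is an assumption, and the function name is hypothetical.

```python
import re

# A name that starts with a letter or '_' and contains only
# alphanumerics and '_', per the description above.
NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def check_environment_lines(lines):
    """Split lines into accepted {name: value} pairs and rejected lines."""
    accepted = {}
    rejected = []
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines are skipped.
        if not line or line.startswith("#"):
            continue
        # The description states that lines starting with 'export' are ignored.
        if line.startswith("export"):
            continue
        name, separator, value = line.partition("=")
        if separator and NAME_RE.match(name):
            accepted[name] = value
        else:
            rejected.append(line)
    return accepted, rejected
```

Lines returned in the rejected list are the likely candidates for the name reported in the message.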


897348 %s: must be run in secure mode using -S flag

Description:

rpc.sccheckd should always be invoked in secure mode. If this message shows up, someone has modified configuration files that affect server startup.

Solution:

Reinstall cluster packages or contact your service provider.


898001 launch_fed_prog: getlocalhostname() failed for program <%s>

Description:

The ucmmd was unable to obtain the name of the local host. Launching of a method failed.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


898738 Aborting node because pm_tick delay of %lld ms exceeds %lld ms

Description:

The system has been unable to send heartbeats for a long time. (The limit is half of the minimum of the timeout values of all the paths. If the timeout value for every path is 10 seconds, then the limit is 5 seconds.) There is probably heavy interrupt activity causing the clock thread to get delayed, which in turn causes irregular heartbeats. The node is aborted because it is considered to be in 'sick' condition and it is better to abort this node instead of causing other nodes (or the cluster) to go down.

Solution:

Check to see what is causing high interrupt activity and configure the system accordingly.
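
The limit described above is simple arithmetic: half of the smallest path timeout. The sketch below restates that calculation; it is not the cluster's implementation, and the function names are hypothetical.

```python
def pm_tick_limit_ms(path_timeouts_ms):
    """Half of the minimum timeout across all paths, in milliseconds."""
    if not path_timeouts_ms:
        raise ValueError("at least one path timeout is required")
    return min(path_timeouts_ms) // 2


def delay_exceeds_limit(delay_ms, path_timeouts_ms):
    """True when the observed pm_tick delay exceeds the limit."""
    return delay_ms > pm_tick_limit_ms(path_timeouts_ms)
```

With every path timeout at 10000 ms, the limit is 5000 ms, which matches the 10 second and 5 second example in the description.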


899278 Retry_count exceeded in Retry_interval

Description:

Fault monitor has detected problems in RDBMS server. The number of restarts through the fault monitor exceeds the count specified in the 'Retry_count' parameter within 'Retry_interval'. The database server is unable to survive on this node. Switching over the resource group to another node.

Solution:

Please check the RDBMS setup and server configuration.
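
One way to reason about the Retry_count and Retry_interval pair is as a sliding time window over recent restarts. The sketch below illustrates that idea only; the actual fault monitor logic and its exact boundary conditions are not documented here, so the strict comparison and the function name are assumptions.

```python
def retry_limit_exceeded(restart_times, now, retry_count, retry_interval):
    """Return True when more than retry_count restarts fall within the
    last retry_interval seconds (all times in seconds)."""
    recent = [t for t in restart_times if now - t <= retry_interval]
    # Assumption: the limit is exceeded only when the count is strictly
    # greater than retry_count.
    return len(recent) > retry_count
```

For example, three restarts at 0, 100, and 200 seconds, checked at 250 seconds with Retry_count 2 and Retry_interval 300, all fall inside the window and exceed the limit.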


899305 clexecd: Daemon exiting because child died.

Description:

Child process in the clexecd program is dead.

Solution:

If this message is seen when the node is shutting down, ignore the message. If that is not the case, the node will halt or reboot itself to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


899648 Failed to process the resource information.

Description:

A Sun cluster data service is unable to retrieve the resource property information. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components.


899776 ERROR: scha_control() was called on resource group <%s>, resource <%s> before the RGM started

Description:

This message most likely indicates that a program called scha_control(1ha,3ha) before the RGM had started up. Normally, scha_control is called by a resource monitor to request failover or restart of a resource group. If the RGM had not yet started up on the cluster, no resources or resource monitors should have been running on any node. The scha_control call will fail with a SCHA_ERR_CLRECONF error.

Solution:

On the node where this message appeared, confirm that rgmd was not yet running (i.e., the cluster was just booting up) when this message was produced. Find out what program called scha_control. If it was a customer-supplied program, this most likely represents an incorrect program behavior which should be corrected. If there is no such customer-supplied program, or if the cluster was not just starting up when the message appeared, contact your authorized Sun service provider for assistance in diagnosing the problem.

Message IDs 900000–999999


900102 Failed to retrieve the resource type property %s: %s.

Description:

An API operation has failed while retrieving the resource type property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components.


900206 parameter '%s%s' must be an integer "%s". Using default value of %d.

Description:

Using a default value for a parameter.

Solution:

None.


900499 Error: low memory

Description:

The rpc.fed server was not able to allocate memory. The server may not be able to capture the output from methods it runs.

Solution:

Investigate if the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


900501 pthread_sigmask: %s

Description:

Solution:


900675 cluster volume manager shared access mode enabled

Description:

Message indicating shared access availability of the volume manager.

Solution:

None.


900843 Retrying to retrieve the cluster information.

Description:

An update to the cluster configuration occurred while cluster properties were being retrieved.

Solution:

Ignore the message.


900954 fatal: Unable to open CCR

Description:

The rgmd was unable to open the cluster configuration repository (CCR). The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


901030 clconf: Data length is more than max supported length in clconf_file_io

Description:

In reading configuration data through CCR FILE interface, found the data length is more than max supported length.

Solution:

Check the CCR configuration information.


902721 Switching over resource group using scha_control GIVEOVER

Description:

The fault monitor has detected problems in the RDBMS server and has determined that the RDBMS server cannot be restarted on this node. An attempt will be made to switch the resource over to another node, if a healthy node is available.

Solution:

Check the cause of RDBMS failure.


903007 txmit_common: udp is null!

Description:

Cannot transmit a message and communicate with udlmctl because the address to send to is null.

Solution:

None.


903317 Entry at position %d in property %s was invalid.

Description:

An invalid entry was found in the named property. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Make sure the property has a valid value.
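
Because the position index starts at 0, position 0 refers to the first entry in the property list and position 2 to the third. The following Python sketch illustrates the lookup; the function name and the sample entries are hypothetical.

```python
def entry_at_position(entries, position):
    """Return the list entry that a zero-based position index refers to."""
    if not 0 <= position < len(entries):
        raise IndexError(f"position {position} is outside the list")
    return entries[position]


if __name__ == "__main__":
    # Hypothetical property value with three entries.
    entries = ["entry-a", "entry-b", "entry-c"]
    print(entry_at_position(entries, 0))  # first element of the list
    print(entry_at_position(entries, 2))  # third element of the list
```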


903370 Command %s failed to run: %s.

Description:

HA-NFS attempted to run the specified command to perform some action which failed. The specific reason for failure is also provided.

Solution:

HA-NFS will take action to recover from this failure, if possible. If the failure persists and service does not recover, contact your service provider. If an immediate repair is desired, reboot the cluster node where this failure is occurring repeatedly.


903734 Failed to create lock directory %s: %s.

Description:

This network resource failed to create a directory in which to store lock files. These lock files are needed to serialize the running of the same callback method on the same adapter for multiple resources.

Solution:

This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action. For specific error information, check the syslog message.


905023 clexecd: dup2 of stderr returned with errno %d while exec'ing (%s). Exiting.

Description:

clexecd program has encountered a failed dup2(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


905591 Could not determine volume configuration daemon mode

Description:

Could not get information about the volume manager daemon mode.

Solution:

Check whether the volume manager has been started correctly.


906435 Encountered an error while validating configuration.

Description:

An error occurred during the validations of specified HAStoragePlus global device paths and/or FilesystemMountPoints.

Solution:

Investigate possible RGM, DSDL, DCS errors. Contact your authorized Sun service provider for assistance in diagnosing the problem.


906589 Error retrieving network address resource in resource group.

Description:

An error occurred reading the indicated extension property.

Solution:

Check syslog messages for errors logged from other system modules. If error persists, please report the problem.


906838 reservation warning(%s) - do_scsi3_registerandignorekey() error for disk %s, attempting do_scsi3_register()

Description:

The device fencing program has encountered errors while trying to access a device and is now trying to run do_scsi3_register().

Solution:

This is an informational message, no user action is needed.


907960 scvxvmlg error - stat(%s) failed with errno %d

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


908240 in libsecurity realloc failed

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start, or a client was not able to make an rpc connection to the server, probably due to low memory. An error message is output to syslog.

Solution:

Investigate if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


908387 ucm_callback for step %d generated exception %d

Description:

ucmm callback for a step failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


908591 Failed to stop fault monitor.

Description:

An attempt was made to stop the fault monitor and it failed. There may be prior messages in syslog indicating specific problems.

Solution:

If prior messages in syslog indicate specific problems, correct them. If that does not resolve the issue, try the following. Use the process monitor facility (pmfadm(1M)) with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the fault monitor of this resource; the tag ends in the string ".mon" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the fault monitor. This problem may occur when the cluster is under load and Sun Cluster cannot stop the fault monitor within the specified timeout period. Consider increasing the Monitor_Stop_timeout property. If the error persists, reboot the node.
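
The tag identification step can be sketched in Python. This sketch assumes the tag names have already been collected into a list (for example, copied by hand from the pmfadm -L output); it does not parse pmfadm output, and the sample tag strings and the function name are invented for illustration.

```python
def find_monitor_tags(tags, resource_group, resource):
    """Return the tags that end in ".mon" and contain both names."""
    return [
        tag for tag in tags
        if tag.endswith(".mon") and resource_group in tag and resource in tag
    ]


if __name__ == "__main__":
    # Invented tag names; real tags on a cluster node may look different.
    tags = ["rg1,res1.svc", "rg1,res1.mon", "rg2,res9.mon"]
    for tag in find_monitor_tags(tags, "rg1", "res1"):
        print(tag)
```

The tag selected this way is the one to pass to pmfadm(1M) with the -s option, as described in the solution.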


908716 fatal: Aborting this node because method <%s> failed on resource <%s> and Failover_mode is set to HARD

Description:

A STOP method has failed or timed out on a resource, and the Failover_mode property of that resource is set to HARD.

Solution:

No action is required. This is normal behavior of the RGM. With Failover_mode set to HARD, the rgmd reboots the node to force the resource group offline so that it can be switched onto another node. Other syslog messages that occurred just before this one might indicate the cause of the STOP method failure.


909069 tcpmodopen: Could not allocate private data

Description:

Machine is out of memory.


909656 Unable to open /dev/kmem:%s

Description:

The HA-NFS fault monitor attempted to open the device but failed. The specific cause of the failure is logged with the message. The /dev/kmem interface is used to read NFS activity counters from the kernel.

Solution:

No action. HA-NFS fault monitor would ignore this error and try to open the device again later. Since it is unable to read NFS activity counters from kernel, HA-NFS would attempt to contact nfsd by means of a NULL RPC. A likely cause of this error is lack of resources. Attempt to free memory by terminating any programs which are using large amounts of memory and swap. If this error persists, reboot the node.


909737 Error loading dtd for %s

Description:

Solution:


910546 Although there are no other potential masters, RGM is failing resource group <%s> off of node <%d> because there are other current masters.

Description:

A scha_control(1HA,3HA) GIVEOVER attempt succeeded, even though no candidate node was available to host the resource group, because the resource group was currently mastered by another node.

Solution:

No action required.


911176 Successfully started BV on %s

Description:

Informational message indicating that the BV servers and daemons on the specified host have started.

Solution:

No action needed.


912352 Could not unplumb some IP addresses.

Description:

Some of the IP addresses managed by the LogicalHostname resource were not successfully brought offline on this node.

Solution:

Use the ifconfig command to make sure that the IP addresses are indeed absent. Check the error messages that precede this one for a more precise reason for the failure. Use the scswitch command to move the resource group to a different node. If the problem persists, reboot.


912696 The action to be taken as determined by scds_fm_action is failover. However the application is not being failed over because the failover_enabled extension property is set to false. Restarting the application instead.

Description:

The failover_enabled property is set to false. The probe is trying to restart the application locally instead of failing over.

Solution:

This is an informational message, no user action is needed.


912866 Could not validate CCR tables; halting node

Description:

The rgmd was unable to check the validity of the CCR tables representing Resource Types and Resource Groups. The node will be halted to prevent further errors.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


913654 %s not specified on command line.

Description:

Required property not included in a scrgadm command.

Solution:

Reissue the scrgadm command with the required property and value.


914248 Listener status probe timed out after %s seconds.

Description:

An attempt to query the status of the Oracle listener using command 'lsnrctl status <listener_name>' did not complete in the time indicated, and was abandoned. HA-Oracle will attempt to kill the listener and then restart it.

Solution:

None, HA-Oracle will attempt to restart the listener. However, the cause of the listener hang should be investigated further. Examine the log file and syslog messages for additional information.


914519 Error when sending message to child %m

Description:

Error occurred when communicating with fault monitor child process. Child process will be stopped and restarted.

Solution:

If the error persists, disable the fault monitor and report the problem.


914655 Restarting the resource %s.

Description:

The process monitoring facility tried to send a message to the fault monitor noting that the data service application died. It was unable to do so.

Solution:

Since some part (daemon) of the application has failed, it would be restarted. If fault monitor is not yet started, wait for it to be started by Sun Cluster framework. If fault monitor has been disabled, enable it using scswitch.


914866 Unable to complete some unshare commands.

Description:

HA-NFS postnet_stop method was unable to complete the unshare(1M) command for some of the paths specified in the dfstab file.

Solution:

The exact pathnames which failed to be unshared would have been logged in earlier messages. Run those unshare commands by hand. If problem persists, reboot the node.


915389 Failed to create socket: %s.

Description:

Failure in communication between fault monitor and process monitor facility.

Solution:

This is an internal error. Save the /var/adm/messages file and contact your Sun service provider.


917591 fatal: Resource type <%s> update failed with error <%d>; aborting node

Description:

Rgmd failed to read updated resource type from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


918018 load balancer for group '%s' released

Description:

This message indicates that the service group has been deleted.

Solution:

This is an informational message, no user action is needed.


918488 Validation failed. Invalid command line parameter %s %s.

Description:

Unable to process parameters passed to the callback method specified. This is a Sun Cluster HA for Sybase internal error.

Solution:

Report this problem to your authorized Sun service provider.


919740 WARNING: error in translating address (%s) for nodeid %d

Description:

Could not get address for a node.

Solution:

Make sure the node is booted as part of a cluster.


919860 scvxvmlg warning - %s does not link to %s, changing it

Description:

The program responsible for maintaining the VxVM device namespace has discovered inconsistencies between the VxVM device namespace on this node and the VxVM configuration information stored in the cluster device configuration system. If configuration changes were made recently, then this message should reflect one of the configuration changes. If no changes were made recently or if this message does not correctly reflect a change that has been made, the VxVM device namespace on this node may be in an inconsistent state. VxVM volumes may be inaccessible from this node.

Solution:

If this message correctly reflects a configuration change to VxVM diskgroups then no action is required. If the change this message reflects is not correct, then the information stored in the device configuration system for each VxVM diskgroup should be examined for correctness. If the information in the device configuration system is accurate, then executing '/usr/cluster/lib/dcs/scvxvmlg' on this node should restore the device namespace. If the information stored in the device configuration system is not accurate, it must be updated by executing '/usr/cluster/bin/scconf -c -D name=diskgroup_name' for each VxVM diskgroup with inconsistent information.


920103 created %d threads to handle resource group switchback; desired number = %d

Description:

The rgmd was unable to create the desired number of threads upon starting up. This is not a fatal error, but it may cause RGM reconfigurations to take longer because it will limit the number of tasks that the rgmd can perform concurrently.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Examine other syslog messages on the same node to see if the cause of the problem can be determined.


920736 Unknown transport type: %s

Description:

The transport type used is not known to Solaris Clustering.


922085 INTERNAL ERROR CMM: Memory allocation error.

Description:

The CMM failed to allocate memory during initialization.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


922330 getservbyname() failed : %s.

Description:

Entries for required services are missing.

Solution:

Verify the sources for the services specified in the /etc/nsswitch.conf file. These should have the entry for the required service.
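
To check by hand whether a service entry resolves through the configured name service sources, a lookup along the following lines can be used. This is a minimal sketch that relies on Python's standard socket.getservbyname call; the helper name and the service name "example-service" are placeholders, because this entry does not name the required services.

```python
import socket


def lookup_service_port(name, protocol="tcp"):
    """Return the port number for a service name, or None if it is not found."""
    try:
        return socket.getservbyname(name, protocol)
    except OSError:
        # Raised when no source listed in nsswitch.conf has an entry
        # for this service and protocol.
        return None


if __name__ == "__main__":
    # "example-service" is a placeholder, not a real Sun Cluster service name.
    port = lookup_service_port("example-service")
    print("missing" if port is None else port)
```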


922363 resource %s status msg on node %s change to <%s>

Description:

This is a notification from the rgmd that a resource's fault monitor status message has changed.

Solution:

This is an informational message, no user action is needed.


922870 tag %s: unable to kill process with SIGKILL

Description:

The rpc.fed server is not able to kill the process with a SIGKILL. This means the process is stuck in the kernel.

Solution:

Save the syslog messages file. Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified.


923184 CMM: Scrub failed for quorum device %s.

Description:

The scrub operation for the specified quorum device failed, due to which this quorum device will not be added to the cluster.

Solution:

There may be other related messages on this node that may indicate the cause of this problem. Refer to the disk repair section of the administration guide for resolving this problem. After the problem has been resolved, retry adding the quorum device.


923618 Prog <%s>: unknown command.

Description:

An internal error in ucmmd has prevented it from successfully executing a program.

Solution:

Save a copy of /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


923712 CCR: Table %s on joining node %s has the same version but different checksum as the copy in the current membership. The table on the joining node will be replaced by the one in the current membership.

Description:

The indicated table on the joining node has the same version but different contents as the one in the current membership. It will be replaced by the one in the current membership.

Solution:

This is an informational message, no user action is needed.
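
The condition that triggers this message can be sketched as follows. The sketch is illustrative only: the checksum algorithm that the CCR actually uses is not stated in this guide, so SHA-256 is used here purely as a stand-in, and the function name is hypothetical.

```python
import hashlib


def joining_copy_is_replaced(join_version, join_data, member_version, member_data):
    """Return True when the versions match but the table contents differ."""
    same_version = join_version == member_version
    # SHA-256 is a stand-in; the real CCR checksum algorithm is not documented here.
    same_checksum = (
        hashlib.sha256(join_data).digest() == hashlib.sha256(member_data).digest()
    )
    return same_version and not same_checksum


if __name__ == "__main__":
    # Hypothetical table contents with the same version number.
    print(joining_copy_is_replaced(3, b"key value-a\n", 3, b"key value-b\n"))
```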


925260 Unlock failed: %s.


925379 resource group <%s> in illegal state <%s>, will not run %s on resource <%s>

Description:

While creating or deleting a resource, the rgmd discovered the containing resource group to be in an unexpected state on the local node. As a result, the rgmd did not run the INIT or FINI method (as indicated in the message) on that resource on the local node. This should not occur, and may indicate an internal logic error in the rgmd.

Solution:

The error is non-fatal, but it may prevent the indicated resource from functioning correctly on the local node. Try deleting the resource, and if appropriate, re-creating it. If those actions succeed, then the problem was probably transitory. Since this problem may indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


925953 reservation error(%s) - do_scsi3_register() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

For the user action required by this message, see the user action for message 192619.


926099 char *fmt


926201 RGM aborting

Description:

A fatal error has occurred in the rgmd. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


926550 scha_cluster_get failed for (%s) with %d

Description:

Call to get cluster information failed. The second part of the message gives the error code.

Solution:

The calling program should handle this error. If it is not recoverable, it will exit.


926749 Resource group nodelist is empty.

Description:

Empty value was specified for the nodelist property of the resource group.

Solution:

Any of the following situations might have occurred. Different user action is required for these different scenarios. 1) If a new resource is created or updated, check the value of the nodelist property. If it is empty or not valid, then provide a valid value using the scrgadm(1M) command. 2) For all other cases, treat it as an internal error. Contact your authorized Sun service provider.


927042 Validation failed. SYBASE backup server startup file RUN_%s not found SYBASE=%s.

Description:

Backup server was specified in the extension property Backup_Server_Name. However, Backup server startup file was not found. Backup server startup file is expected to be: $SYBASE/$SYBASE_ASE/install/RUN_<Backup_Server_Name>

Solution:

Check the Backup server name specified in the Backup_Server_Name property. Verify that the SYBASE and SYBASE_ASE environment variables are set properly in the Environment_file. Verify that the RUN_<Backup_Server_Name> file exists.
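
The expected location of the startup file can be assembled as shown in this Python sketch. The directory names and the server name used in the example are hypothetical; substitute the values from your Environment_file and the Backup_Server_Name property.

```python
import os


def backup_server_run_file(sybase, sybase_ase, backup_server_name):
    """Build $SYBASE/$SYBASE_ASE/install/RUN_<Backup_Server_Name>."""
    return os.path.join(sybase, sybase_ase, "install", "RUN_" + backup_server_name)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    path = backup_server_run_file("/opt/sybase", "ASE-12_5", "SYB_BACKUP")
    print(path, "exists" if os.path.isfile(path) else "not found")
```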


927753 Fault monitor does not exist or is not executable

Description:

Fault monitor program specified in support file is not executable or does not exist. Recheck your installation.

Solution:

Please report this problem.


927846 fatal: Received unexpected result <%d> from rpc.fed, aborting node

Description:

A serious error has occurred in the communication between rgmd and rpc.fed while attempting to execute a VALIDATE method. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


928235 Validation failed. Adaptive_Server_Log_File %s not found.

Description:

File specified in the Adaptive_Server_Log_File extension property was not found. The Adaptive_Server_Log_File is used by the fault monitor for monitoring the server.

Solution:

Please check that file specified in the Adaptive_Server_Log_File extension property is accessible on all the nodes.


928382 CCR: Failed to read table %s on node %s.

Description:

The CCR failed to read the indicated table on the indicated node. The CCR will attempt to recover this table from other nodes in the cluster.

Solution:

This is an informational message, no user action is needed.


928455 clcomm: Couldn't write to routing socket: %d

Description:

The system prepares IP communications across the private interconnect. A write operation to the routing socket failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


929100 No permission for others to execute %s.

Description:

The specified path does not have the correct permissions as expected by a program.

Solution:

Set the permissions for the file so that it is readable and executable by others (world).
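
Whether a file already grants read and execute permission to others can be checked with a short Python sketch that uses the standard os and stat modules. The function name and the example path are hypothetical.

```python
import os
import stat


def others_can_read_and_execute(path):
    """Return True if the "other" read and execute permission bits are both set."""
    mode = os.stat(path).st_mode
    required = stat.S_IROTH | stat.S_IXOTH
    return mode & required == required


if __name__ == "__main__":
    # Hypothetical path; replace it with the path named in the message.
    example = "/usr/bin/env"
    if os.path.exists(example):
        print(others_can_read_and_execute(example))
```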


929252 Failed to start HA-NFS system fault monitor.

Description:

Process monitor facility has failed to start the HA-NFS system fault monitor.

Solution:

Check whether the system is low in memory or the process table is full and correct these problems. If the error persists, use scswitch to switch the resource group to another node.


929712 Share path %s: file system %s is not global.

Description:

The specified share path exists on a file system which is neither mounted via /etc/vfstab nor is a global file system.

Solution:

Share paths in HA-NFS must satisfy this requirement.


930059 %s: %s.

Description:

HA-SAP failed to access a file. The file in question is specified with the first '%s'. The reason it failed is provided with the second '%s'.

Solution:

Check and make sure the file is accessible via the path listed.


930851 ERROR: process_resource: resource <%s> is pending_fini but no FINI method is registered

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


931677 Could not reset SCSI buses on CMM reconfiguration. Could not find clexecd in nameserver.

Description:

An error occurred when the SC 3.0 software was in the process of resetting SCSI buses with shared nodes that are down.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


932126 CCR: Quorum regained.

Description:

The cluster lost quorum at some time after the cluster came up, and quorum has now been regained.

Solution:

This is an informational message, no user action is needed.


933073 INTERNAL ERROR: unable to obtain static node membership of the cluster; continuing

Description:

This is a non-fatal internal error. The rgmd is unable to get the static node membership. A complete set of all possible nodes will be used instead. This may cause some spurious state transitions for non-existent nodes to be syslogged on a new president or when a Resource or a Resource Group is created.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


934249 Invalid value for property %s: %d.

Description:

An invalid value may have been specified for the property.

Solution:

Please set correct value for the property and retry the operation.


935576 $HOSTNAME is not configured for BV processes.

Description:

The specified host is not configured for BV processes.

Solution:

Configure BV processes to run on this host, or create the resource group properly with the correct network resource and BV resource in the resource group.


936306 svc_setschedprio: Could not setup RT (real time) scheduling parameters: %s

Description:

The server was not able to set the scheduling mode parameters, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


937669 CCR: Failed to update table %s.

Description:

The CCR data server failed to update the indicated table.

Solution:

There may be other related messages on this node, which may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced.


938163 %s %s server startup encountered errors, errno = %d.

Description:

TCP server to accept connections on the private interconnect could not be started.


938189 rpc.fed: program not registered or server not started

Description:

The rpc.fed daemon was not correctly initialized, or has died. This has caused a step invocation failure, and may cause the node to reboot.

Solution:

Check that the Solaris Clustering software has been installed correctly. Use ps(1) to check if rpc.fed is running. If the installation is correct, the reboot should restart all the required daemons including rpc.fed.


938618 Couldn't create deleted subdir %s: error (%d)

Description:

While mounting this file system, PXFS was unable to create some directories that it reserves for internal use.

Solution:

If the error is 28 (ENOSPC), mount this file system non-globally, free some space, and then mount it globally. If there is some other error, and you are unable to correct it, contact your authorized Sun service provider to determine whether a workaround or patch is available.
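
The error number in the message can be compared against the symbolic name rather than a literal value. The following Python sketch uses the standard errno module; the function name is hypothetical.

```python
import errno
import os


def is_out_of_space(error_number):
    """Return True if the error number means "no space left on device"."""
    return error_number == errno.ENOSPC


if __name__ == "__main__":
    # Print the platform's numeric value and text for ENOSPC.
    print(errno.ENOSPC, os.strerror(errno.ENOSPC))
```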


938836 invalid value for parameter '%sfailfast': "%s". Using default value of 'panic'

Description:

/opt/SUNWudlm/etc/udlm.conf did not have a valid entry for failfast mode. Default mode of 'panic' will be used.

Solution:

None.


939374 CCR: Failed to access cluster repository during synchronization. ABORT node.

Description:

This node failed to access its cluster repository when it first came up in cluster mode and tried to synchronize its repository with other nodes in the cluster.

Solution:

This is usually caused by an unrecoverable failure such as disk failure. There may be other related messages on this node, which may help diagnose the problem. If the root disk on the afflicted node has failed, then it needs to be replaced. If the root disk is full on this node, boot the node into non-cluster mode and free up some space by removing unnecessary files.


940685 Configuration file %s missing for NetBackup.

Description:

The configuration file for NetBackup is missing or does not have correct permissions.

Solution:

Check whether the NetBackup configuration file bp.conf, or a link to it exists under /usr/openv/netbackup, and that the file has correct permissions.


941267 Cannot determine command passed in: <%s>.

Description:

An invalid pathname, displayed within the angle brackets, was passed to a libdsdev routine such as scds_timerun or scds_pmf_start. This could be the result of mis-configuring the name of a START or MONITOR_START method or other property, or a programming error made by the resource type developer.

Solution:

Supply a valid pathname to a regular, executable file.


941367 open failed: %s

Description:

Failed to open /dev/console. The "open" man page describes possible error codes.

Solution:

None. ucmmd will exit.


941416 One or more of the SUNW.HAStoragePlus resources that this resource depends on is not online anywhere.

Description:

It is an invalid configuration to create an application resource that depends on one or more SUNW.HAStoragePlus resource(s) that are not online on any node.

Solution:

Bring the SUNW.HAStoragePlus resource(s) online before creating the application resource that depends on them, and then try the command again.


941693 "%s" Failed to stay up.

Description:

The tag shown, being run by the rpc.pmfd server, has exited. Either the user has decided to stop monitoring this process, or the process exceeded the number of retries. An error message is output to syslog.

Solution:

This message is informational; no user action is needed.


943168 pmf_monitor_suspend: pmf_add_triggers: %s

Description:

The rpc.pmfd server was not able to resume the monitoring of a process, and the monitoring of this process has been aborted. An error message is output to syslog.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


944068 clcomm: validate_policy: invalid relationship moderate %d high %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. The moderate server thread level cannot be higher than the high server thread level.

Solution:

No user action required.
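
The one constraint stated above, that the moderate level cannot exceed the high level, can be expressed as a simple check. This Python sketch covers only that constraint; any rules involving the pool value are not described in this entry and are not modeled. The function name and the sample numbers are hypothetical.

```python
def moderate_not_above_high(moderate, high):
    """Return True if the moderate thread level does not exceed the high level."""
    return moderate <= high


if __name__ == "__main__":
    # Hypothetical thread levels, for illustration only.
    print(moderate_not_above_high(8, 16))
    print(moderate_not_above_high(20, 16))
```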


944121 Incorrect permissions set for %s

Description:

This file does not have the expected default execute permissions.

Solution:

Reset the permissions to allow execute permissions using the chmod command.


945325 Livecache instance name is not defined in script lccluster.

Description:

The livecache instance name (in upper case) is not defined in the script 'lccluster'.

Solution:

Make sure livecache instance name is defined in script lccluster. See the instructions in script file lccluster for details.


946660 Failed to create sap state file %s:%s Might put sap resource in stop-failed state.

Description:

If SAP is brought up outside the control of Sun Cluster, HA-SAP creates the state file to signal the stop method not to try to stop SAP via Sun Cluster. If SAP was brought up outside of Sun Cluster and the state file creation failed, the SAP resource might end in the stop-failed state when Sun Cluster tries to stop SAP.

Solution:

This is an internal error. No user action needed. Save the /var/adm/messages from all nodes. Contact your authorized Sun service provider.


946873 The Host $i is not yet up.

Description:

The host specified is not running.

Solution:

Bring the resource group containing the specified host online if it is not yet running. If the resource group is already online, the probe will take appropriate action.


947007 Error initializing the cluster version manager (error %d).

Description:

This message can occur when the system is booting if incompatible versions of cluster software are installed.

Solution:

Verify that any recent software installations completed without errors and that the installed packages or patches are compatible with the rest of the installed software. Also, contact your authorized Sun service provider to determine whether a workaround or patch is available.


947401 reservation error(%s) - Unable to open device %s, errno %d

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

This may be indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.
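
The transition-specific guidance above can be summarized as a lookup table. The Python sketch below only restates the text of this entry; the dictionary and function names are hypothetical and are not part of any Sun Cluster interface.

```python
# Follow-up actions by transition name, restated from the solution text above.
RECOVERY_HINTS = {
    "node_join": "run '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes",
    "release_shared_scsi2": "run '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes",
    "make_primary": "use the scswitch command to move or start the device group",
    "primary_to_secondary": "retry the shutdown or switchover of the device group",
}


def recovery_hint(transition):
    """Return the follow-up action for a transition named in the message."""
    return RECOVERY_HINTS.get(transition, "no specific guidance in this entry")


if __name__ == "__main__":
    print(recovery_hint("make_primary"))
```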


948847 ucm_callback for start_trans generated exception %d

Description:

The ucmm callback for the start transition failed. The step may have timed out.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


949148 "%s" requeued

Description:

The tag shown has exited and was restarted by the rpc.pmfd server. An error message is output to syslog.

Solution:

This message is informational; no user action is needed.


949565 reservation error(%s) - do_scsi2_tkown() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

The action which failed is a scsi-2 ioctl. These can fail if there are scsi-3 keys on the disk. To remove invalid scsi-3 keys from a device, use 'scdidadm -R' to repair the disk (see scdidadm man page for details). If there were no scsi-3 keys present on the device, then this error is indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.


949937 Out of memory.

Description:

A process has failed to allocate new memory, most likely because the system has run out of swap space.

Solution:

The problem will probably be cured by rebooting. If the problem reoccurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.


949937 Out of memory.

Description:

The data service has failed to allocate memory, most likely because the system has run out of swap space.

Solution:

The problem will probably be cured by rebooting. If the problem reoccurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.


950747 resource %s monitor disabled.

Description:

This is a notification from the rgmd that the operator has disabled monitoring on a resource. This may be used by system monitoring tools.

Solution:

This is an informational message; no user action is needed.


950760 reservation fatal error(%s) - get_resv_lock() error

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


951501 CCR: Could not initialize CCR data server.

Description:

The CCR data server could not initialize on this node. This usually happens when the CCR is unable to read its metadata entries on this node. There is a CCR data server per cluster node.

Solution:

There may be other related messages on this node, which may help diagnose this problem. If the root disk failed, it needs to be replaced. If there was cluster repository corruption, then the cluster repository needs to be restored from backup or other nodes in the cluster. Boot the offending node in -x mode to restore the repository. The cluster repository is located at /etc/cluster/ccr/.


951520 Validation failed. SYBASE ASE runserver file RUN_%s not found SYBASE=%s.

Description:

Sybase Adaptive Server is started by specifying the Adaptive Server 'runserver' file named RUN_<Server Name>, located under $SYBASE/$SYBASE_ASE/install. This file is missing.

Solution:

Verify that the Sybase installation includes the 'runserver' file and that permissions are set correctly on the file. The file should reside in the $SYBASE/$SYBASE_ASE/install directory.


951634 INTERNAL ERROR CMM: clconf_get_quorum_table() returned error %d.

Description:

The node encountered an internal error during initialization of the quorum subsystem object.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


951733 Incorrect usage: %s.

Description:

The usage of the program was incorrect for the reason given.

Solution:

Use the correct syntax for the program.


952006 received signal %d: exiting

Description:

Solution:


952237 Method <%s>: unknown command.

Description:

An internal error has occurred in the interface between the rgmd and fed daemons. This in turn will cause a method invocation to fail. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


952465 HA: exception adding secondary

Description:

A failure occurred while attempting to add a secondary provider for an HA service.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


952857 Confdir_list is not defined correctly in script lccluster.

Description:

Confdir_list path is not defined correctly in the script 'lccluster'.

Solution:

Make sure the path for Confdir_list is defined in the script lccluster using parameter 'CONFDIR_LIST'. The value should be defined inside the double quotes, and it is the same as what is defined for extension property 'Confdir_list'.


953642 Server is not running. Calling shutdown abort to clear shared memory (if any)

Description:

Informational message. The Oracle server is not running. However, if Oracle processes were aborted without clearing shared memory, the leftover shared memory can cause problems when starting the Oracle server. Any leftover shared memory is being cleared.

Solution:

None


954497 clcomm: Unable to find %s in name server

Description:

The specified entity is unknown to the name server.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


955930 Attempt to connect from addr %s port %d

Description:

There was a connection from the named IP address and port number (> 1024), which means that a non-privileged process is trying to talk to the PNM daemon.

Solution:

This message is informational; no user action is needed. However, it would be a good idea to find out which non-privileged process is trying to talk to the PNM daemon, and why.


956501 Issuing a failover request.

Description:

This message indicates that the function is about to make a failover request to the RGM. If the request fails, refer to the syslog messages that appear after this message.

Solution:

This is an informational message; no user action is required.


957086 Prog <%s> failed to execute step <%s> - error=<%d>

Description:

ucmmd failed to execute a step.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


957535 Smooth_shutdown flag is not set to TRUE. The WebLogic Server will be shutdown using sigkill.

Description:

This is an informational message. The Smooth_shutdown flag is not set to TRUE, and hence the WLS will be stopped using SIGKILL.

Solution:

None. To enable smooth shutdown, set the Smooth_shutdown extension property to TRUE. The username and password that are passed to the "java weblogic.Admin .." command must also be set in the start script. Refer to your Admin guide for details.


958832 INTERNAL ERROR: monitoring is enabled, but MONITOR_STOP method is not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


958888 clcomm: Failed to allocate simple xdoor client %d

Description:

The system could not allocate a simple xdoor client. This can happen when the xdoor number is already in use. This message is only possible on debug systems.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


959384 Possible syntax error in hosts entry in %s.

Description:

The validation callback method has failed to validate the hostname list. There may be a syntax error in the nsswitch.conf file.

Solution:

Check for the following syntax rules in the nsswitch.conf file. 1) Check if the lookup order for "hosts" has "files". 2) "cluster" is the only entry that can come before "files". 3) Everything in between '[' and ']' is ignored. 4) It is illegal to have any leading whitespace character at the beginning of the line; these lines are skipped. Correct the syntax in the nsswitch.conf file and try again.
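
The four rules above can be checked mechanically. The following Python sketch is one possible reading of them, not the validation code used by the cluster software. The function name check_hosts_entry is hypothetical, and treating '#' as the start of a comment is an assumption based on ordinary nsswitch.conf syntax.

```python
import re

def check_hosts_entry(nsswitch_text):
    """Return a list of problems with the 'hosts' entry; an empty list means none found."""
    for line in nsswitch_text.splitlines():
        # Rule 4: lines that begin with whitespace are skipped (as are blanks and comments).
        if not line or line[0].isspace() or line.startswith("#"):
            continue
        if not line.startswith("hosts:"):
            continue
        spec = line.split(":", 1)[1].split("#", 1)[0]
        # Rule 3: everything between '[' and ']' is ignored.
        spec = re.sub(r"\[[^\]]*\]", " ", spec)
        sources = spec.split()
        # Rule 1: the lookup order must contain "files".
        if "files" not in sources:
            return ['"files" is missing from the hosts lookup order']
        # Rule 2: only "cluster" may come before "files".
        bad = [s for s in sources[:sources.index("files")] if s != "cluster"]
        if bad:
            return ['only "cluster" may precede "files"; found: ' + ", ".join(bad)]
        return []
    return ['no usable "hosts" entry found']
```

An entry such as "hosts: cluster files dns" passes, while "hosts: dns files" is reported because dns precedes files.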


959610 Property %s should have only one value.

Description:

A multi-valued (comma-separated) list was provided to the scrgadm command for the property, while the implementation supports only one value for this property.

Solution:

Specify a single value for the property on the scrgadm command.


960308 clcomm: Pathend %p: remove_path called twice

Description:

The system maintains state information about a path. The remove_path operation is not allowed in this state.

Solution:

No user action is required.


960344 ERROR: process_resource: resource <%s> is pending_init but no INIT method is registered

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


960862 (%s) sigaction failed: %s (UNIX errno %d)

Description:

The udlm has failed to initialize signal handlers by a call to sigaction(2). The error message indicates the reason for the failure.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


960932 Switchover (%s) error: failed to fsck disk

Description:

The file system specified in the message could not be hosted on the node the message came from because an fsck on the file system revealed errors.

Solution:

Unmount the PXFS file system (if mounted), fsck the device, and then mount the PXFS file system again.


961551 Signal %d terminated the scalable service configuration process.

Description:

An unexpected signal caused the termination of the program that configures the networking components for a scalable resource. This premature termination will cause the scalable service configuration to be aborted for this resource.

Solution:

Save a copy of the /var/adm/messages files on all nodes. If a core file was generated, submit the core to your service provider. Contact your authorized Sun service provider for assistance in diagnosing the problem.


961768 Failed to unshare %s.


962746 Usage: %s [-c|-u] -R <resource-name> -T <type-name> -G <group-name> [-r sys_def_prop=values ...] [-x ext_prop=values ...].

Description:

Incorrect arguments are passed to the callback methods.

Solution:

This is an internal error. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


963465 fatal: rpc_control() failed to set automatic MT mode; aborting node

Description:

The rgmd failed in a call to rpc_control(3N). This error should never occur. If it did, it would cause the failure of subsequent invocations of scha_cmds(1HA) and scha_calls(3HA). This would most likely lead to resource method failures and prevent RGM reconfigurations from occurring. The rgmd will produce a core file and will force the node to halt or reboot.

Solution:

Examine other syslog messages occurring at about the same time to see if the source of the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem. Reboot the node to restart the clustering daemons.


963755 lkcm_cfg: caller is not registered

Description:

udlm is not registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


964072 Unable to resolve %s.

Description:

The data service has failed to resolve the host information.

Solution:

If the logical host and shared address entries are specified in the /etc/inet/hosts file, check that these entries are correct. If this is not the reason, then check the health of the name server. For more error information, check the syslog messages.


964083 t_open (open_cmd_port) failed

Description:

Call to t_open() failed. The "t_open" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


964399 udlm seq no (%d) does not match library's (%d).

Description:

Mismatch in sequence numbers between udlm and the library code is causing an abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


964521 Failed to retrieve the resource handle: %s.

Description:

An API operation on the resource has failed.

Solution:

For the resource name, check the syslog tag. For more details, check the syslog messages from other components. If the error persists, reboot the node.


965873 CMM: Node %s (nodeid = %d) with votecount = %d added.

Description:

The specified node with the specified votecount has been added to the cluster.

Solution:

This is an informational message; no user action is needed.


965935 Node %d: weight %d


966245 %d entries found in property %s. For a secure Netscape Directory Server instance %s should have one or two entries.

Description:

Since a secure Netscape Directory Server instance can listen on only one or two ports, the list property should have either one or two entries. A different number of entries was found.

Solution:

Change the number of entries to be either one or two.


966416 This list element in System property %s has an invalid protocol: %s.

Description:

The system property that was named does not have a valid protocol.

Solution:

Change the value of the property to use a valid protocol.


966670 did discovered faulty path, ignoring: %s

Description:

scdidadm has discovered a suspect logical path under /dev/rdsk. It will not add it to subpaths for a given instance.

Solution:

Check to see that the symbolic links under /dev/rdsk are correct.


966842 in libsecurity unknown security flag %d

Description:

This is an internal error which shouldn't happen. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


967050 Validation failed. Listener binaries not found ORACLE_HOME=%s

Description:

Oracle listener binaries not found under ORACLE_HOME. ORACLE_HOME specified for the resource is indicated in the message. HA-Oracle will not be able to manage Oracle listener if ORACLE_HOME is incorrect.

Solution:

Specify the correct ORACLE_HOME when creating the resource. If the resource has already been created, update the resource property 'ORACLE_HOME'.


967080 This resource depends on a HAStoragePlus resouce that is not online on this node. Ignoring validation errors.

Description:

The resource depends on a HAStoragePlus resource. Some of the files required for validation checks are not accessible from this node, because the HAStoragePlus resource is not online on this node. Validations will be performed on the node that has the HAStoragePlus resource online. Validation errors are being ignored on this node by this callback method.

Solution:

Check the validation errors logged in the syslog messages. Please verify that these errors are not configuration errors.


967372 Directory %s is not readable.


967970 Modification of resource <%s> failed because none of the nodes on which VALIDATE would have run are currently up

Description:

In order to change the properties of a resource whose type has a registered VALIDATE method, the rgmd must be able to run VALIDATE on at least one node. However, all of the candidate nodes are down. "Candidate nodes" are either members of the resource group's Nodelist or members of the resource type's Installed_nodes list, depending on the setting of the resource's Init_nodes property.

Solution:

Boot one of the resource group's potential masters and retry the resource change operation.


968426 arguments to bv_utils are $*

Description:

Just a debug message.

Solution:

No action needed.


968557 Could not unplumb any ip addresses.

Description:

Failed to unplumb any IP addresses. The resource cannot be brought offline. The node will be rebooted by Sun Cluster.

Solution:

Check the syslog messages from other components for a possible root cause. Save a copy of /var/adm/messages and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


968853 scha_resource_get error (%d) when reading system property %s

Description:

Error occurred in API call scha_resource_get.

Solution:

Check syslog messages for errors logged from other system modules. Stop and start the fault monitor. If the error persists, disable the fault monitor and report the problem.


969008 t_alloc (open_cmd_port-T_ADDR) %d

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


969827 Failover attempt has failed.

Description:

The failover attempt of the resource was rejected or encountered an error.

Solution:

For a more detailed error message, check the syslog messages. Check whether the Pingpong_interval has an appropriate value. If not, adjust it using scrgadm(1M). Otherwise, use scswitch to switch the resource group to a healthy node.


970018 Probe for %s returned error.

Description:

Probe for the specified service returned error.


970912 execve: %s

Description:

The rpc.pmfd server was not able to exec a new process, possibly due to bad arguments. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Verify that the file path to be executed exists. If all looks correct, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


971233 Property %s is not set.

Description:

The property has not been set by the user and must be.

Solution:

Reissue the scrgadm command with the required property and value.


971412 Error in getting global service name for path <%s>

Description:

The path cannot be mapped to a valid service name.

Solution:

Check the path passed into extension property "ServicePaths" of SUNW.HAStorage type resource.


972580 CCR: Highest epoch is < 0, highest_epoch = %d.

Description:

The epoch indicates the number of times a cluster has come up. It should not be less than 0. It could happen due to corruption in the cluster repository.

Solution:

Boot the cluster in -x mode to restore the cluster repository on all the members of the cluster from backup. The cluster repository is located at /etc/cluster/ccr/.


972610 fork: %s

Description:

The rgmd, rpc.pmfd or rpc.fed daemon was not able to fork a process, possibly due to low swap space. The message contains the system error. This can happen while the daemon is starting up (during the node boot process), or when executing a client call. If it happens when starting up, the daemon does not come up. If it happens during a client call, the server does not perform the action requested by the client.

Solution:

Investigate if the machine is running out of swap space. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


972716 Failed to stop the application with SIGKILL. Returning with failure from stop method.

Description:

The stop method failed to stop the application with SIGKILL.

Solution:

Use pmfadm(1M) with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the application in this resource. This can be easily identified as the tag ends in the string ".svc" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the application. If the error still persists, then reboot the node.
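
Identifying the tag can be sketched as a simple filter over the tag names. The following Python sketch assumes that the tags reported by pmfadm have already been collected into a list of strings; the function name find_service_tags and the sample tag names in the usage note are hypothetical.

```python
def find_service_tags(tags, resource_group, resource):
    """Return the tags that end in '.svc' and contain both names."""
    return [
        tag for tag in tags
        if tag.endswith(".svc") and resource_group in tag and resource in tag
    ]
```

Given the hypothetical tags "web-rg,web-rs,0.svc" and "web-rg,web-rs,0.mon", a call with resource group "web-rg" and resource "web-rs" returns only the first tag.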


972908 Unable to get the name of the local cluster node: %s.

Description:

An internal error occurred while attempting to obtain the local cluster nodename.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


973615 Node %s: weight %d

Description:

The load balancer set the specified weight for the specified node.

Solution:

This is an informational message; no user action is needed.


973933 resource %s added.

Description:

This is a notification from the rgmd that the operator has created a new resource. This may be used by system monitoring tools.

Solution:

This is an informational message; no user action is needed.


974106 lkcm_parm: caller is not registered

Description:

udlm is not registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


974129 Cannot stat %s: %s.

Description:

The stat(2) system call failed on the specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start. The reason for the failure is stated in the message. The error could be the result of 1) mis-configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.
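
A check along these lines can be scripted. The following Python sketch uses only the standard library to test whether a pathname refers to a regular file that the current user can execute. The function name is_regular_executable is hypothetical, and the check is an approximation of what a libdsdev routine would require, not its actual logic.

```python
import os
import stat

def is_regular_executable(path):
    """Return (ok, reason) for the given pathname."""
    try:
        st = os.stat(path)  # follows symbolic links, like stat(2)
    except OSError as err:
        return False, "Cannot stat %s: %s" % (path, err.strerror)
    if not stat.S_ISREG(st.st_mode):
        return False, "%s is not a regular file" % path
    if not os.access(path, os.X_OK):
        return False, "%s is not executable" % path
    return True, "ok"
```

The reason string makes it easier to tell a missing file from a directory or a file without execute permission.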


974664 HA: no valid secondary provider in rmm - aborting

Description:

This node joined an existing cluster. Then all of the other nodes in the cluster died before the HA framework components on this node could be properly initialized.

Solution:

This node must be rebooted.


976495 fork failed: %s

Description:

Call to fork() failed. The "fork" man page describes possible error codes.

Solution:

Some system resource has been exceeded. Install more memory, increase swap space or reduce peak memory consumption.


976914 fctl: %s

Description:

Solution:


977371 Backup server terminated.

Description:

Graceful shutdown did not succeed. Backup server processes were killed in STOP method. It is likely that adaptive server terminated prior to shutdown of backup server.

Solution:

Please check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and root user.


978125 in libsecurity setnetconfig failed when initializing the server: %s - %s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish a rpc connection for the network specified. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


978534 %s: lookup failed.

Description:

Could not get the hostname for a node. This could also be because the node is not booted as part of a cluster.

Solution:

Make sure the node is booted as part of a cluster.


978829 t_bind, did not bind to desired addr

Description:

Call to t_bind() failed. The "t_bind" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


979343 Error: duplicate prog <%s> launched step <%s>

Description:

Due to an internal error, ucmmd has attempted to launch the same step by duplicate programs. ucmmd will reject the second program and treat it as a step failure.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


979803 CMM: Node being shut down.

Description:

This node is being shut down.

Solution:

This is an informational message; no user action is needed.


980307 reservation fatal error(%s) - Illegal command

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


980425 Aborting startup: could not determine whether failover of NFS resource groups is in progress.

Description:

Startup of an NFS resource was aborted because it was not possible to determine if failover of any NFS resource groups is in progress.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


980477 LogicalHostname online.

Description:

The status of the logicalhost resource is online.

Solution:

This is an informational message. No user action is required.


980681 clconf: CSR removal failed

Description:

While executing a task in clconf and modifying the state of a proxy, a failure was received when trying to remove the CSR.

Solution:

This is an informational message. No user action is required.


980942 CMM: Cluster doesn't have operational quorum yet; waiting for quorum.

Description:

Not enough nodes are operational to obtain a majority quorum; the cluster is waiting for more nodes before starting.

Solution:

If nodes are booting, wait for them to finish booting and join the cluster. Boot nodes that are down.
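
The majority condition can be illustrated with a short calculation. The following Python sketch assumes that "majority" means strictly more than half of the configured votes; the function name has_quorum is hypothetical, and the sketch does not model how votes are assigned to nodes or other quorum participants.

```python
def has_quorum(votes_present, votes_configured):
    """Return True if the votes present form a strict majority of configured votes."""
    return votes_present * 2 > votes_configured
```

Under this assumption, 2 of 3 votes is a majority, but 2 of 4 is not, so a cluster in the latter state would keep waiting.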


981739 CCR: Updating invalid table %s.

Description:

This joining node carries a valid copy of the indicated table with override flag set while the current cluster membership doesn't have a valid copy of this table. This node will update its copy of the indicated table to other nodes in the cluster.

Solution:

This is an informational message; no user action is needed.


981931 INTERNAL ERROR: postpone_start_r: meth type <%d>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


984704 reset_rg_state: unable to change state of resource group <%s> on node <%d>; assuming that node died

Description:

The rgmd was unable to reset the state of the specified resource group to offline on the specified node, presumably because the node died.

Solution:

Examine syslog output on the specified node to determine the cause of node death. The syslog output might indicate further remedial actions.


985111 lkcm_reg: illegal %s value

Description:

Cluster information that is being used during udlm registration with ucmm is incorrect.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


985417 %s: Invalid arguments, restarting service.

Description:

The PMF action script supplied by the DSDL while launching the process tree was called with invalid arguments.

Solution:

This is an internal error. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


986190 Entry at position %d in property %s with value %s is not a valid node identifier or node name.

Description:

The value given for the named property has an invalid node specified for it. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Specify a valid node for the property.
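
The zero-based position reported in this message can be reproduced with a short check. The following Python sketch is illustrative only and is not part of Sun Cluster: the function name, the node names, and the node identifiers are all assumptions.

```python
def first_invalid_entry(entries, valid_names, valid_ids):
    """Return (position, value) for the first entry that is neither a known
    node name nor a known node identifier, or None if every entry is valid.
    Positions start at 0, matching the index reported in the message."""
    for position, value in enumerate(entries):
        if value in valid_names:
            continue
        if value.isdigit() and int(value) in valid_ids:
            continue
        return position, value
    return None


# Hypothetical two-node cluster; real names and IDs come from your configuration.
names = {"phys-node-1", "phys-node-2"}
ids = {1, 2}
print(first_invalid_entry(["phys-node-1", "3", "2"], names, ids))
```

In this example the second element ("3") matches neither a node name nor a node identifier, so the reported position is 1.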


986197 reservation fatal error(%s) - malloc() error, errno %d

Description:

The device fencing program has been unable to allocate required memory.

Solution:

Memory usage should be monitored on this node and steps taken to provide more available memory if problems persist. Once memory has been made available, the following steps may need to be taken, depending on the transition named in the message.

If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, access to shared devices can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes.

If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. The device group can be switched back to this node if desired by using the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to start the device group.

If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.
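
The guidance in the Solution text above can be summarized as a lookup keyed by the transition name that appears in the message. This Python sketch only restates that guidance. The dictionary layout and the function name are assumptions made for illustration, not a Sun Cluster interface.

```python
# Command quoted from the Solution text above.
RUN_RESERVE = "/usr/cluster/lib/sc/run_reserve -c node_join"

# transition name -> (likely effect, suggested follow-up)
FOLLOW_UP = {
    "node_join": (
        "This node may be unable to access shared devices.",
        "Run '" + RUN_RESERVE + "' on all cluster nodes.",
    ),
    "release_shared_scsi2": (
        "A node that was joining the cluster may be unable to access shared devices.",
        "Run '" + RUN_RESERVE + "' on all cluster nodes.",
    ),
    "make_primary": (
        "A device group has failed to start on this node.",
        "Use the scswitch command to switch the device group back or to start it.",
    ),
    "primary_to_secondary": (
        "The shutdown or switchover of a device group has failed.",
        "Retry the desired action.",
    ),
}


def follow_up(transition):
    """Return (effect, action) for a transition, or None if it is not listed."""
    return FOLLOW_UP.get(transition)
```

A transition that is not listed returns None, which signals that the message needs to be read in full rather than matched against this summary.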


986466 clexecd: stat of '%s' failed

Description:

The clexecd program failed to stat the directory indicated in the error message.

Solution:

Make sure the directory exists. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


987455 in libsecurity weak Unix authorization failed

Description:

A server (rgmd) refused an rpc connection from a client because it failed the Unix authentication. This happens if a caller program using scha public api, either in its C form or its CLI form, is not running as root. An error message is output to syslog.

Solution:

Check that the calling program using the scha public api is running as root. If the program is running as root, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.
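
A calling program can check its own privileges before using the scha API. The Python sketch below tests the effective user ID. Whether the server examines the effective or the real user ID is not stated here, so treat that choice as an assumption.

```python
import os


def running_as_root():
    """Return True if the effective user ID of this process is 0 (root)."""
    return os.geteuid() == 0


if __name__ == "__main__":
    if not running_as_root():
        print("Not running as root; scha API calls are likely to be refused.")
```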


987601 scvxvmlg error - opendir(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that /global/.devices/node@N (where N is this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.
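
One way to confirm that the directory is accessible is to repeat the failing operation by hand. The Python sketch below opens the directory for reading, much as opendir() does, and reports the system error if that fails. The function names are assumptions, and the check does not confirm that the file system is mounted globally.

```python
import os


def global_devices_dir(node_number):
    """Build the per-node global devices path named in the Solution text."""
    return "/global/.devices/node@%d" % node_number


def try_opendir(path):
    """Open the directory for reading and close it again.

    Returns None on success, or the OSError that describes the failure."""
    try:
        with os.scandir(path):
            return None
    except OSError as err:
        return err


if __name__ == "__main__":
    # Node number 1 is a placeholder; use this node's own node number.
    error = try_opendir(global_devices_dir(1))
    print("accessible" if error is None else "failed: %s" % error)
```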


988355 scha_control: rejecting RESOURCE_IS_RESTARTED call on resource <%s> because it does not have a Retry_interval property

Description:

A resource monitor (or some other program) is attempting to notify the RGM that the indicated resource has been restarted, by calling scha_control(1ha),(3ha) with the RESOURCE_IS_RESTARTED option. This request is rejected because the resource type does not declare the Retry_interval property for its resources. This represents a bug in the calling program. To enable the RESOURCE_IS_RESTARTED functionality, the resource type registration (RTR) file must declare the Retry_interval property.

Solution:

Contact the author of the data service (or of whatever program is attempting to call scha_control) and report the error.


988416 t_sndudata (2) in send_reply: %s

Description:

Call to t_sndudata() failed. The "t_sndudata" man page describes possible error codes.

Solution:

None.


988719 Warning: Unexpected result returned while checking for the existence of scalable service group %s: %d.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.
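
Saving a copy of a log file before it is rotated can be scripted. The following Python sketch copies one file into a destination directory under a timestamped name. The function name and the destination directory are assumptions, and the script would need to be run on each node.

```python
import os
import shutil
import time


def save_copy(source, dest_dir):
    """Copy source into dest_dir under a timestamped name; return the new path."""
    os.makedirs(dest_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = os.path.join(dest_dir, "%s.%s" % (os.path.basename(source), stamp))
    shutil.copy2(source, target)  # copy2 also preserves modification times
    return target


if __name__ == "__main__":
    # /var/tmp/cluster-logs is a hypothetical destination directory.
    print(save_copy("/var/adm/messages", "/var/tmp/cluster-logs"))
```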


988762 Invalid connection attempted from %s: %s

Description:

Solution:


988885 libpnm error: %s

Description:

This means that there is an error either in libpnm being able to send the command to the PNM daemon or in libpnm receiving a response from the PNM daemon.

Solution:

The user of libpnm should handle these errors. However, the following messages have specific meanings:

network is too slow: libpnm was not able to read data from the network. Either the network is congested or the resources on the node are dangerously low.

scha_cluster_open failed: the call to initialize a handle to get cluster information failed. This means that the command will not be sent to the PNM daemon.

scha_cluster_get failed: the call to get cluster information failed. This means that the command will not be sent to the PNM daemon.

can't connect to PNMd: libpnm was not able to connect to the PNM daemon through the private interconnects. There could be other related error messages.

wrong version of PNMd: libpnm connected to a PNM daemon which did not give the correct version number.


988937 Extension property <failover_enabled> has a value of <%d>

Description:

Resource property failover_enabled is set to a value or has a default value. The value '1' means TRUE and '0' means FALSE.

Solution:

This is an informational message, no user action is needed.


989693 thr_create failed

Description:

Could not create a new thread. The "thr_create" man page describes possible error codes.

Solution:

Some system resource has been exceeded. Install more memory, increase swap space or reduce peak memory consumption.


989846 ERROR: unpack_rg_seq(): rgname_to_rg failed <%s>

Description:

Due to an internal error, the rgmd was unable to find the specified resource group data in memory.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


989958 wrong command length received %d

Description:

This means that the PNM daemon received a command from libpnm, but not all of the bytes were received.

Solution:

This is not a serious error. It might be caused by network problems. If the error persists, send a KILL (9) signal to pnmd; PMF will restart pnmd automatically. If the problem still persists, restart the node with scswitch -S and shutdown(1M).


990215 HA: repl_mgr: exception while invoking RMA reconf object

Description:

An unrecoverable failure occurred in the HA framework.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


990418 received signal %d

Description:

The daemon indicated in the message tag (rgmd or ucmmd) has received a signal, possibly caused by an operator-initiated kill(1) command. The signal is ignored.

Solution:

The operator must use scswitch(1M) and shutdown(1M) to take down a node, rather than directly killing the daemon.


991108 uaddr2taddr (open_cmd_port) failed

Description:

Call to uaddr2taddr() failed. The "uaddr2taddr" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


991130 pthread_create: %s

Description:

The rpc.pmfd server was not able to allocate a new thread, probably due to low memory, and the system error is shown. This can happen when a new tag is started, or when monitoring for a process is set up. If the error occurs when a new tag is started, the tag is not started and pmfadm returns error. If the error occurs when monitoring for a process is set up, the process is not monitored. An error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


991800 in libsecurity transport %s is not a loopback transport

Description:

A server (rpc.pmfd, rpc.fed or rgmd) refused an rpc connection from a client because the named transport is not a loopback. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


991864 putenv: %s

Description:

The rpc.pmfd server was not able to change environment variables. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


992912 clexecd: thr_sigsetmask returned %d. Exiting.

Description:

The clexecd program has encountered a failed thr_sigsetmask(3THR) call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


992998 clconf: CSR registration failed

Description:

While executing a task in clconf and modifying the state of the proxy, a failure was received when registering the CSR.

Solution:

This is an informational message. No user action is required.


994915 %s: Cannot get transport information.

Description:

The daemon is unable to get needed information about the transport over which it provides RPC service.


995026 lkcm_cfg: invalid handle was passed %s %d

Description:

Handle for communication with udlmctl during a call to return the current DLM configuration is invalid.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


995339 Restarting using scha_control RESTART

Description:

The fault monitor has detected problems in the RDBMS server. An attempt will be made to restart the RDBMS server on the same node.

Solution:

Check the cause of RDBMS failure.


995532 Could not bring up some ip addresses.


995859 scha_cluster_get failed

Description:

Call to get cluster information failed. This means that the incoming connection to the PNM daemon will not be accepted.

Solution:

There could be other related error messages which might be helpful. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


996075 fatal: Unable to resolve %s from nameserver

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


996887 reservation message(%s) - attempted removal of scsi-3 keys from non-scsi-3 device %s

Description:

The device fencing program has detected scsi-3 registration keys on a device which is not configured for scsi-3 PGR use. The keys have been removed.

Solution:

This is an informational message, no user action is needed.


996897 Method <%s> on resource <%s>: stat of program file failed.

Description:

The rgmd was unable to access the indicated resource method file. This may be caused by incorrect installation of the resource type.

Solution:

Consult resource type documentation; [re-]install the resource type, if necessary.
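
Before reinstalling, it can help to examine the method file directly. The Python sketch below performs the same stat call and adds two checks, for a regular file and for execute permission, that go beyond what this message reports. The function name is an assumption, and the path to pass in is the method path from your own resource type registration.

```python
import os
import stat


def check_method_file(path):
    """Return a short diagnosis string for a resource method program path."""
    try:
        info = os.stat(path)
    except OSError as err:
        return "stat failed: %s" % err.strerror
    if not stat.S_ISREG(info.st_mode):
        return "not a regular file"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"
```

A result of "stat failed" corresponds to the condition in this message. The other results point to installation problems that would surface later, when the method is launched.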


996902 Stopped the HA-NFS system fault monitor.

Description:

The HA-NFS system fault monitor was stopped successfully.

Solution:

No action required.


997568 modinstall of tcpmod failed

Description:

The Streams module that intercepts private interconnect communication could not be installed.


997689 IP address %s is an IP address in resource %s and in resource %s.

Description:

The same IP address is being used in two resources. This is not a correct configuration.

Solution:

Delete one of the resources that is using the duplicated IP address.
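
If many resources are configured, a duplicated address can be located by inverting the mapping from resources to addresses. The Python sketch below is illustrative: the resource names and addresses are made up, and addresses are compared as plain strings, so two different spellings of the same address would not be detected.

```python
from collections import defaultdict


def duplicated_addresses(resources):
    """Map each IP address used by more than one resource to the sorted
    list of resources that use it."""
    users = defaultdict(set)
    for resource, addresses in resources.items():
        for address in addresses:
            users[address].add(resource)
    return {addr: sorted(names) for addr, names in users.items() if len(names) > 1}


# Hypothetical configuration gathered from your own resource definitions.
config = {
    "rs-web": ["192.0.2.10", "192.0.2.11"],
    "rs-db": ["192.0.2.10"],
}
print(duplicated_addresses(config))
```

In this example 192.0.2.10 appears in both resources, so it is the only address reported.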


998022 Failed to restart the service: %s.

Description:

Restart attempt of the data service has failed.

Solution:

Check the syslog messages that occurred just before this message to determine whether there was an internal error. In case of an internal error, contact your Sun service provider. Otherwise, one of the following situations may have occurred. 1) The Start_timeout and Stop_timeout values are not appropriate. Check these values and adjust them if necessary. 2) The system lacks resources. Check whether the system is low on memory or the process table is full, and take appropriate action.