Sun Cluster 3.0 5/02 Error Messages Guide

Chapter 5 Message IDs 400000 - 499999

Error Message List

The following list is ordered by the message ID.
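
Because the list is ordered by message ID, a lookup usually starts from the ID found in a logged line. The following Python sketch shows one way to extract that ID. It assumes the ID appears in the syslog line in the form "[ID nnnnnn facility.level]". The sample line and the helper names are hypothetical and are not part of Sun Cluster.

```python
import re

# Assumption: the syslog line carries the ID as "[ID <digits> <facility>.<level>]".
ID_PATTERN = re.compile(r"\[ID (\d+) [\w.]+\]")


def extract_message_id(line):
    """Return the numeric message ID found in a syslog line, or None."""
    match = ID_PATTERN.search(line)
    return int(match.group(1)) if match else None


def in_this_chapter(message_id):
    """This chapter covers message IDs 400000 through 499999."""
    return 400000 <= message_id <= 499999


# Hypothetical sample line, for illustration only.
sample = "Jun  1 12:00:00 node1 Cluster.RGM.rgmd: [ID 426678 daemon.error] rgmd died"
```

If the extracted ID falls in the range of this chapter, search the list for that number to find the matching Description and Solution.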

400592 :UNIX DLM is asking for a reconfiguration to recover from a communication error. This message is acceptable during a reconfiguration already in progress.

Description:

The cluster will reconfigure.

Solution:

None.

400855 :Processes on $HOSTNAME are stopped.

Description:

The Sun Cluster HA for BroadVision One-To-One Enterprise processes on the specified host have stopped.

Solution:

No user action required.

401115 :t_rcvudata (recv_request) failed

Description:

Call to t_rcvudata() failed. The "t_rcvudata" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

401400 :Successfully stopped the application

Description:

This message informs the administrator that the application was stopped successfully.

Solution:

This is an informational message; no user action is needed.

401573 :INTERNAL ERROR: START method not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Because this problem might indicate an internal logic error in the rgmd, save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of an scrgadm -pvv command. Report the problem to your authorized Sun service provider.

402289 :t_bind: %s

Description:

Call to t_bind() failed. The "t_bind" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

402484 :NULL command string passed.

Description:

A NULL value was specified for the command argument.

Solution:

Specify a non-NULL value for the command string.

403257 :Failed to start Backup server.

Description:

Sun Cluster HA for Sybase failed to start the backup server. Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Determine whether the server can be started manually. Examine the HA-Sybase log files, backup server log files and setup.

404309 :in libsecurity cred flavor is not AUTH_SYS

Description:

A server (rpc.pmfd, rpc.fed or rgmd) refused an rpc connection from a client because the authorization is not of UNIX type. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

404866 :method_full_name: malloc failed

Description:

The rgmd server was not able to create the full name of the method, while trying to connect to the rpc.fed server, probably due to low memory. An error message is output to syslog.

Solution:

Investigate whether the host is running out of memory. If it is not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

405030 :Hosts in the startup order are not up.The Probe will start the processes on %s.

Description:

The resource groups containing the specified host should be online, but the Sun Cluster HA for BroadVision One-To-One Enterprise processes cannot start because the hosts in the startup order (backend hosts) are not online. The Sun Cluster HA for BroadVision One-To-One Enterprise Probe should wait for these hosts to start before it starts the processes on the specified host.

Solution:

If the resource groups that contain the backend resources are not online, bring them online. If they are online, no user action is required because the Sun Cluster HA for BroadVision One-To-One Enterprise processes might be in the process of coming online and the Sun Cluster HA for BroadVision One-To-One Enterprise Probe should take the appropriate action.

405508 :clcomm: Adapter %s has been deleted

Description:

A network adapter has been removed.

Solution:

No action required.

405552 :Unable to contact fault monitor, restarting service.

Description:

The process monitoring facility tried to send a message to the fault monitor noting that the data service application died. It was unable to do so.

Solution:

Because some part (daemon) of the application has failed, the application will be restarted. If the fault monitor is not yet started, wait for the Sun Cluster framework to start it. If the fault monitor has been disabled, enable it by using scswitch.

406522 :resource group %s state on node %s change to RG_PENDING_OFF_STOP_FAILED

Description:

This is a notification from the rgmd that a resource group has had a STOP method failure or timeout on one of its resources. This may be used by system monitoring tools. The resource group will move to the ERROR_STOP_FAILED state on the given node.

Solution:

Refer to the procedure for clearing the ERROR_STOP_FAILED condition on a resource group in the Sun Cluster Administration Guide.

406610 :st_ff_arm failed: %s

Description:

The rpc.pmfd server was not able to initialize the failfast mechanism. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog. The message contains the system error.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

406635 :fatal: joiners_run_boot_methods: exiting early because of unexpected exception

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

408214 :Failed to create scalable service group %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

408282 :clcomm: RT or TS classes not configured

Description:

The system requires either real time or time sharing thread scheduling classes for use in user processes. Neither class is available.

Solution:

Configure Solaris to support the real time thread scheduling class, the time sharing thread scheduling class, or both for user processes.

408672 :Removing file %s.

Description:

Sun Cluster HA for NetBackup removes Sun Cluster HA for NetBackup startup and shutdown scripts from /etc/rc2.d and /etc/rc0.d to prevent automatic startup and shutdown of Sun Cluster HA for NetBackup.

Solution:

No user action required.

408742 :svc_setschedprio: Could not save current scheduling parameters: %s

Description:

The server was not able to save the original scheduling mode. The system error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

409267 :Error opening procfs control file (for parent process) <%s> for tag <%s>:%

Description:

The rpc.pmfd server was not able to open the procfs control file for the parent process, and the system error is shown. procfs control files are required in order to monitor user processes.

Solution:

Investigate whether the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

409443 :fatal: unexpected exception in rgm_init_pres_state

Description:

This node encountered an unexpected error while communicating with other cluster nodes during a cluster reconfiguration. The rgmd will produce a core file and will cause the node to halt or reboot.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

409693 :Aborting startup: failover of NFS resource groups might be in progress.

Description:

Startup of an NFS resource was aborted because a failure was detected by another resource group, which should be in the process of failing over.

Solution:

Attempt to start the NFS resource after the failover completes. You might need to start the resource on another node if the current node is not healthy.

410860 :lkcm_act: cm_reconfigure failed: %s

Description:

ucmm reconfiguration failed. This could also point to a problem with the interconnect components.

Solution:

None if the next reconfiguration succeeds. If not, save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

411227 :Failed to stop the process with: %s. Retry with SIGKILL.

Description:

The Process Monitor Facility (PMF) failed to stop the data service with the indicated signal. It is retrying the stop with SIGKILL.

Solution:

Check the Stop_timeout property and adjust it if the value is not appropriate.

411369 :Not found clexecd on node %d for %d seconds. Giving up!

Description:

clexecd could not be found to execute the program on a node. The message indicates that the system is giving up after retrying for the indicated number of seconds.

Solution:

This is an informational message; no user action is needed.

412106 :Internal Error. Unable to get fault monitor name

Description:

This is an internal error. Could not determine fault monitor program name.

Solution:

Please report this problem.

412366 :setsid failed: %s

Description:

Failed to run the "setsid" command. The "setsid" man page describes possible error codes.

Solution:

None. ucmmd will exit.

412533 :clcomm: validate_policy: invalid relationship moderate %d low %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. The moderate server thread level cannot be less than the low server thread level.

Solution:

No user action required.
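
The relationship that this check enforces can be sketched in Python as follows. Only the rule itself (the moderate level cannot be less than the low level) comes from the description. The function name is hypothetical, and the pool value is simply echoed in the message because the text does not state how it is validated.

```python
def check_policy(low, moderate, pool):
    """Return None if the levels are consistent, else a message like the logged one.

    Only the moderate-versus-low rule from the description is checked.
    The pool value is echoed, not validated (assumption).
    """
    if moderate < low:
        return ("clcomm: validate_policy: invalid relationship "
                "moderate %d low %d pool %d" % (moderate, low, pool))
    return None
```

For example, check_policy(4, 8, 0) reports no problem, while check_policy(8, 4, 0) returns the message text with moderate 4 and low 8.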

412558 :inet addr %s length %d = %s

Description:

Information about hosts.

Solution:

None.

413513 :INTERNAL ERROR Failfast: ff_impl_shouldnt_happen.

Description:

An internal error has occurred in the failfast software.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

413569 :CCR: Invalid CCR table : %s.

Description:

CCR could not find a valid version of the indicated table on the nodes in the cluster.

Solution:

There may be other related messages on the nodes where the failure occurred. They may help diagnose the problem. If the indicated table is unreadable due to disk failure, the root disk on that node needs to be replaced. If the table file is corrupted or missing, boot the cluster in -x mode to restore the indicated table from backup. The CCR tables are located at /etc/cluster/ccr/.

414680 :fatal: register_president: Don't have reference to myself

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

415842 :fatal: scswitch_onoff: invalid opcode <%d>

Description:

While attempting to execute an operator-requested enable or disable of a resource, the rgmd has encountered an internal error. This error should not occur. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

416483 :Failed to retrieve the resource information.

Description:

A Sun Cluster data service is unable to retrieve the resource information. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components.

416904 :Orbixd Probe failed.

Description:

The orbix daemon probe failed.

Solution:

No user action required. The Sun Cluster HA for BroadVision One-To-One Enterprise Probe should take appropriate action.

417144 :Must be root to start %s

Description:

The program or daemon has been started by someone not in superuser mode.

Solution:

Log in as root and run the program. If it is a daemon, it may be incorrectly installed. Reinstall cluster packages or contact your service provider.

419220 :%s restore operation failed.

Description:

In the process of creating a shared address resource the system was attempting to reconfigure the IP addresses on the system. The specified operation failed.

Solution:

Use the ifconfig command to make sure that all the IP addresses are present. If they are not, remove the shared address resource and run the scrgadm command to recreate it. If the problem persists, reboot.

419301 :The probe command <%s> timed out.

Description:

A timeout occurred when executing the probe command provided by the user under the hatimerun(1M) utility.

Solution:

This problem may occur when the cluster is under load. You may consider increasing the Probe_timeout property.

419529 :INTERNAL ERROR CMM: Failure registering callbacks.

Description:

An instance of the userland CMM encountered an internal initialization error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

419972 :clcomm: Adapter %s is faulted

Description:

A network adapter has encountered a fault.

Solution:

Any interconnect failure should be resolved, and/or a failed node rebooted.

420591 :BV Config Error:IMs configured on both physical and private interconnect.

Description:

The Interaction Managers are configured on both the physical host and the private host. This is not supported. The Interaction Managers should be configured on only one, either the physical node or the cluster private node.

Solution:

Reconfigure the Sun Cluster HA for BroadVision One-To-One Enterprise servers with IMs on either the physical node or the cluster private IP, but not both. Refer to your Sun Cluster HA for BroadVision One-To-One Enterprise documentation.

420763 :Switchover (%s) error (%d) after failure to become secondary

Description:

The file system specified in the message could not be hosted on the node the message came from.

Solution:

Check /var/adm/messages to make sure there were no device errors. If there were none, contact your authorized Sun service provider to determine whether a workaround or patch is available.

422190 :Failed to reboot node: %s.

Description:

Sun Cluster HA for NFS fault monitor was attempting to reboot the node, because rpcbind daemon was unresponsive. However, the attempt to reboot the node itself did not succeed.

Solution:

Manually reboot the node. Also see message ID 804791.

422214 :CMM: Votecount changed from %d to %d for quorum device %ld (%s).

Description:

The votecount for the specified quorum device has been changed as indicated.

Solution:

This is an informational message; no user action is needed.

422541 :Failed to register with PDTserver

Description:

Communication with the PDT server has been lost. Scalable services will no longer work. The nodes that are configured to be the primaries and secondaries for the PDT server are probably down.

Solution:

Restart any of the nodes that are configured to be the primary or secondary for the PDT server.

423538 :WARNING: UDLM_PROCEED was picked up by a lkcm_act, returning LKCM_NOOP

Description:

An internal warning during udlm state update.

Solution:

None.

423958 :resource group %s state change to unmanaged.

Description:

This is a notification from the rgmd that a resource group's state has changed. This may be used by system monitoring tools.

Solution:

This is an informational message; no user action is needed.

424061 :Validation failed. ORACLE_HOME %s does not exist

Description:

The directory specified as ORACLE_HOME does not exist. The ORACLE_HOME property is specified when creating Oracle_server and Oracle_listener resources.

Solution:

Specify the correct ORACLE_HOME when creating the resource. If the resource is already created, update the resource property 'ORACLE_HOME'.

424095 :scvxvmlg fatal error - %s does not exist

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.

424774 :Resource group <%s> requires operator attention due to STOP failure

Description:

This is a notification from the rgmd that a resource group has had a STOP method failure or timeout on one of its resources. The resource group is in ERROR_STOP_FAILED state. This may cause another operation such as scswitch(1M), scrgadm(1M), or scha_control(1HA,3HA) to fail with a SCHA_ERR_STOPFAILED error.

Solution:

Refer to the procedure for clearing the ERROR_STOP_FAILED condition on a resource group in the Sun Cluster Administration Guide.

424816 :Unable to set automatic MT mode.

Description:

The rpc.pmfd server was not able to set the multi-threaded operation mode. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

424834 :Failed to connect to %s port %d for secure resource %s.

Description:

An error occurred while the fault monitor tried to connect to a port specified in the Port_list property for this secure resource.

Solution:

Ensure that the Port_list property is set to the same port number that the Netscape Directory Server is running on.

425053 :CCR: Can't access table %s while updating it on node %s errno = %d.

Description:

The indicated error occurred while updating the indicated table on the indicated node. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28 (ENOSPC) indicates that the root file system on the node is full. Other values of errno can be returned when the root disk has failed (EIO) or some of the CCR tables have been deleted outside the control of the cluster software (ENOENT).

Solution:

There may be other related messages on the node where the failure occurred. These may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the indicated table was accidentally deleted, then boot the offending node in -x mode to restore the indicated table from other nodes in the cluster. The CCR tables are located at /etc/cluster/ccr/. If the root disk on the afflicted node has failed, then it needs to be replaced.
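
The errno values named in the description can be decoded with a short Python sketch such as the following. The mapping of each value to a likely cause is taken from the description. The helper names are hypothetical, and Python reports the errno numbering of the host on which the script runs, which is assumed here to match the numbering on the cluster node.

```python
import errno
import os

# Likely causes for the three errno values named in the description.
LIKELY_CAUSE = {
    errno.ENOSPC: "root file system is full",
    errno.EIO: "root disk has failed",
    errno.ENOENT: "CCR table was deleted outside the control of the cluster software",
}


def describe_errno(value):
    """Return the symbolic name, number, and system text for an errno value."""
    name = errno.errorcode.get(value, "UNKNOWN")
    return "%s (%d): %s" % (name, value, os.strerror(value))


def likely_cause(value):
    """Return the cause suggested by the description, or a pointer to errno.h."""
    return LIKELY_CAUSE.get(value, "see /usr/include/sys/errno.h")
```

Passing the errno value from the logged message to likely_cause() gives a first hint about which of the remedies in the Solution applies.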

425551 :getnetconfigent (open_cmd_port) failed

Description:

Call to getnetconfigent failed and ucmmd could not get network information. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

426221 :CMM: Reservation key changed from %s to %s for node %s (id = %d).

Description:

The reservation key for the specified node was changed. This can only happen due to the CCR infrastructure being changed by hand, which is not a supported operation. The system cannot continue, and the node will panic.

Solution:

Boot the node in non-cluster (-x) mode, recover a good copy of the file /etc/cluster/ccr/infrastructure from one of the cluster nodes or from backup, and then boot this node back in cluster mode. If all nodes in the cluster exhibit this problem, then boot them all in non-cluster mode, make sure that the infrastructure files are the same on all of them, and boot them back in cluster mode. The problem should not happen again.

426678 :rgmd died

Description:

An inter-node communication failed because the rgmd died on another node. To avoid data corruption, the failfast mechanism will cause that node to halt or reboot.

Solution:

No action is required. The cluster will reconfigure automatically. Examine syslog output on the rebooted node to determine the cause of node death. The syslog output might indicate further remedial actions.

429663 :Node %s not in list of configured nodes

Description:

The specified scalable service could not be started on this node because the node is not in the list of configured nodes for this particular service.

Solution:

If the specified service needs to be started on this node, use scrgadm to add the node to the list of configured nodes for this service and then restart the service.

429819 :Monitor_retry_interval is not set.

Description:

The resource property Monitor_retry_interval is not set. This property specifies the time interval between two restarts of the fault monitor.

Solution:

Ensure that this property is set. Use the scrgadm(1M) command to set this property.

429820 :NetBackup daemon <%s> is not running.

Description:

One of the Sun Cluster HA for NetBackup master daemons ("bprd", "bpdbm", or "vmd") is not running.

Solution:

No user action required.

429907 :clexecd: waitpid returned %d. Returning %d to clexecd.

Description:

clexecd program has encountered a failed waitpid(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

430445 :Monitor initialization error. Incorrect arguments

Description:

Error occurred in monitor initialization. Arguments passed to the monitor by callback methods were incorrect.

Solution:

This is an internal error. Disable the monitor and report the problem.

432473 :reservation fatal error(%s) - joining_node not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

432987 :Failed to retrieve nodeid.

Description:

The data service failed to retrieve the host information.

Solution:

If the logical host and shared address entries are specified in the /etc/inet/hosts file, check that these entries are correct. If this is not the reason, check the health of the name server. For more error information, check the syslog messages.

433438 :Setup error. SUPPORT_FILE %s does not exist

Description:

This is an internal error. Support file is used by HA-Oracle to determine the fault monitor information.

Solution:

Please report this problem.

433481 :reservation fatal error(%s) - did_get_num_paths()error in is_scsi3_disk(), returned %d.

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

433501 :fatal: priocntl: %s (UNIX error %d)

Description:

The daemon indicated in the message tag (rgmd or ucmmd) has encountered a failed system call to priocntl(2). The error message indicates the reason for the failure. The daemon will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

If the error message is not self-explanatory, save a copy of the /var/adm/messages files on all nodes, and of the core file generated by the daemon. Contact your authorized Sun service provider for assistance in diagnosing the problem.

433895 :INTERNAL ERROR: Invalid resource property tunable flag <%d> for property <%s>; aborting node

Description:

An internal error occurred in the rgmd while checking whether a resource property could be modified. The rgmd will produce a core file and will force the node to halt or reboot.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

434480 :CCR: CCR data server not found.

Description:

The CCR data server could not be found in the local name server.

Solution:

Reboot the node. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.

435521 :Warning: node %d does not have a weight assigned to it for property %s, but node %d is in the %s for resource %s. A weight of %d will be used for node %d.

Description:

The named node does not have a weight assigned to it, but it is a potential master of the resource.

Solution:

No user action is required if the default weight is acceptable. Otherwise, use scrgadm(1M) to set the Load_balancing_weights property to include the node that does not have an explicit weight set for it.
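
The fallback that this warning reports can be sketched in Python as follows. The data shapes are assumptions made for illustration: node IDs are integers, assigned weights are held in a dictionary, and the default weight is passed in because its value is not stated in the text. The function name is hypothetical.

```python
def effective_weights(potential_masters, assigned, default_weight):
    """Fill in the default weight for potential masters that lack an explicit one.

    Returns the complete node-to-weight mapping and the list of nodes that
    received the default, which are the nodes the warning would name.
    """
    weights = {}
    defaulted = []
    for node in potential_masters:
        if node in assigned:
            weights[node] = assigned[node]
        else:
            weights[node] = default_weight
            defaulted.append(node)
    return weights, defaulted
```

Any node returned in the second list is a candidate for an explicit entry in the Load_balancing_weights property if the default weight is not acceptable.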

436659 :Failed to start the adaptive server.

Description:

Sun Cluster HA for Sybase failed to start the Sybase server. Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Determine whether the server can be started manually. Examine the HA-Sybase log files, Sybase log files and setup.

437100 :Validation failed. Invalid command line parameter %s %s

Description:

Unable to process parameters passed to the callback method. This is an internal error.

Solution:

Please report this problem.

437236 :dl_bind: DLPI error %u

Description:

DLPI protocol error. We cannot bind to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.

437350 :File system associated with mount point %s is to be locally mounted. Local file systems cannot be specified with scalable service resources.

Description:

A HAStoragePlus resource cannot be a part of a resource group of type scalable, if local file systems are specified in the FilesystemMountPoint extension property. The resource group can only be of type failover.

Solution:

Reevaluate the manner in which the HAStoragePlus resource is to be used.

437975 :The property %s cannot be updated because it affects the scalable resource %s.

Description:

The named property cannot be changed after the resource has been created.

Solution:

If the property must be changed, then the resource should be removed and re-added with the new value of the property.

438174 :No configuration file ${BV1TO1_VAR}/etc/bv1to1.conf.

Description:

The specified configuration file was not found.

Solution:

Verify that Sun Cluster HA for BroadVision One-To-One Enterprise was properly installed. Ensure that the configuration file is in the correct location.

438420 :Interface %s is plumbed but is not suitable for global networking.

Description:

The specified adapter may be either a point-to-point adapter or a loopback adapter, neither of which is suitable for global networking.

Solution:

Reconfigure the appropriate NAFO group to exclude this adapter.

438700 :Some IP addresses might still be on loopback.

Description:

Some of the IP addresses managed by the specified SharedAddress resource were not removed from the loopback interface.

Solution:

Use the ifconfig command to make sure that the IP addresses being managed by the SharedAddress implementation are present either on the loopback interface or on a physical adapter. If they are present on both, use ifconfig to delete them from the loopback interface. Then use scswitch to move the resource group containing the SharedAddresses to another node to make sure that the resource group can be switched over successfully.

438866 :sysinfo in getlocalhostname failed

Description:

sysinfo call did not succeed. The "sysinfo" man page describes possible error codes.

Solution:

This is an internal error. Please report this problem.

439099 :HA: hxdoor %d.%d does not exist on secondary

Description:

An HA framework hxdoor is missing.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

440406:Cannot check online status. Server processes are running.

Description:

Sun Cluster HA for Sybase could not check the online status of the Sybase Adaptive Server. The Sybase Adaptive Server process is running, but the server might not be online yet.

Solution:

Examine the Connect_string resource property. Make sure that the userid and password specified in the connect string are correct and that permissions are granted to the user for connecting to the server. Check the Sybase Adaptive Server log for error messages.

440530 :Started the fault monitor.

Description:

The fault monitor for this data service was started successfully.

Solution:

No action needed.

440792 :Warning: some resources in resource group <%s> failed to start

Description:

The indicated resource group is pending online. One or more resources' START methods failed to execute successfully. Because the resources' Failover_mode is set to NONE, the resource group is moving to the ONLINE state in spite of the resource start failures.

Solution:

This is a warning message, no user action is needed. The operator may choose to issue an scswitch(1M) command to try switching the affected resource group to another node or to try restarting it on the same node.

441826 :"pmfadm -a" Action failed for <%s>

Description:

The given tag has exceeded the allowed number of retry attempts (given by the 'pmfadm -n' option) and the action (given by the 'pmfadm -a' option) was initiated by rpc.pmfd. The action failed (i.e., returned non-zero), and rpc.pmfd will delete this tag from its tag list and discontinue retry attempts.

Solution:

This message is informational; no user action is needed.

442053 :clcomm: Invalid path_manager client_type (%d)

Description:

The system attempted to add a client of unknown type to the set of path manager clients.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

442281 :reservation error(%s) - did_get_path() error in other_node_status()

Description:

The device fencing program has suffered an internal error.

Solution:

442767:Failed to stop SAP processes under PMF with SIGKILL.

Description:

Failed to stop Sun Cluster HA for SAP processes with the Process Monitor Facility (PMF) signal. This is an internal error.

Solution:

Save the /var/adm/messages files from all nodes. Contact your authorized Sun service provider.

443271 :clcomm: Pathend: Aborting node because %s for %u ms

Description:

The pathend aborted the node for the specified reason.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

443479 :CMM: Quorum device %ld with gdevname %s has %d configured path - Ignoring mis-configured quorum device.

Description:

The specified number of configured paths to the specified quorum device is less than two, which is the minimum allowed. This quorum device will be ignored.

Solution:

Reconfigure the quorum device appropriately.
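
The rule in the Description (a quorum device with fewer than two configured paths is ignored) can be sketched as a simple check. The function name, the dictionary layout, and the device and node names below are hypothetical and serve only to illustrate the rule.

```python
MIN_PATHS = 2  # minimum number of configured paths stated in the Description

def ignored_quorum_devices(devices):
    """Return the names of quorum devices with fewer than MIN_PATHS configured paths."""
    return [name for name, paths in devices.items() if len(paths) < MIN_PATHS]

# Hypothetical configuration: device name -> nodes with a configured path.
devices = {
    "/dev/did/rdsk/d3s2": ["node1", "node2"],
    "/dev/did/rdsk/d7s2": ["node1"],
}
print(ignored_quorum_devices(devices))
```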

443746 :resource %s state on node %s change to %s

Description:

This is a notification from the rgmd that a resource's state has changed. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.
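
A monitoring tool could pick this notification out of syslog with a pattern such as the one sketched below. This is an illustration only: it assumes Solaris message-ID logging, in which "[ID number facility.level]" precedes the message text, and the host, resource, node, and state names in the sample line are made up.

```python
import re

# Matches message 443746 and captures the three %s fields of its format string.
STATE_CHANGE = re.compile(
    r"\[ID 443746 [\w.]+\] resource (\S+) state on node (\S+) change to (\S+)"
)

def parse_state_change(line):
    """Return (resource, node, state) for a 443746 message, or None for other lines."""
    match = STATE_CHANGE.search(line)
    return match.groups() if match else None

# Hypothetical log line for demonstration.
line = ("Jun  1 12:00:00 phys-node-1 Cluster.RGM.rgmd: [ID 443746 daemon.notice] "
        "resource web-rs state on node phys-node-1 change to R_ONLINE")
print(parse_state_change(line))
```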

444001 :%s: Call failed, return code=%d

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) to execute the action shown, and was not able to read the rpc error. The rpc error number is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

444144 :clcomm: Cannot change increment

Description:

An attempt was made to change the flow control policy parameter that specifies the thread increment level. The flow control system uses this parameter to set the number of threads that are acted upon at one time. This value currently cannot be changed.

Solution:

No user action required.

445616 :libsecurity: create of rpc handle to program %ld failed, will not retry

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was unable to initiate an rpc connection after multiple retries. The maximum time allowed for connecting has been exceeded, or the types of rpc errors encountered indicate that there is no point in retrying. An accompanying error message shows the rpc error data. The pmfadm or scha command exits with an error. The program number is shown. To find out what program corresponds to this number, use the rpcinfo command. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.
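
Looking up the program number in rpcinfo output can be sketched as follows. The function name and the sample text are hypothetical; the sketch assumes the column layout printed by rpcinfo -p (program, version, protocol, port, and an optional service name).

```python
def program_name(rpcinfo_text, program_number):
    """Return the service name listed for program_number, or None if absent."""
    for line in rpcinfo_text.splitlines():
        fields = line.split()
        # Data rows: program, version, protocol, port, optional service name.
        if len(fields) >= 4 and fields[0].isdigit() and int(fields[0]) == program_number:
            return fields[4] if len(fields) > 4 else None
    return None

# Hypothetical sample of "rpcinfo -p" output.
SAMPLE = """\
   program vers proto   port  service
    100000    4   tcp    111  rpcbind
    100000    3   udp    111  rpcbind
    100024    1   tcp  32772  status
"""
print(program_name(SAMPLE, 100024))
```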

446068 :CMM: Node %s (nodeid = %ld) is down.

Description:

The specified node is considered down because communication with it has been lost.

Solution:

If the node failure is unexpected, resolve the cause of the failure and reboot the node.

446249 :Method <%s> on resource <%s>: authorization error.

Description:

An attempted method execution failed, apparently due to a security violation; this error should not occur. This failure is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be diagnosed. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.

447375 :A SCHA API error occured while retrieving the cluster handle: %s.

Description:

SCHA APIs are used to interface with the Resource Group Manager (RGM) component. It is likely that the RGM is experiencing problems.

Solution:

Inspect the syslog for errors. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

447578 :Duplicated installed nodename when Resource Type <%s> is added.

Description:

The user specified a duplicate installed node name when creating the resource type.

Solution:

Recheck the installed nodename list and make sure there is no nodename duplication.

447872 :fatal: Unable to reserve %d MBytes of swap space; exiting

Description:

The rgmd was unable to allocate a sufficient amount of memory upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

448703 :clcomm: validate_policy: high too small. high %d low %d nodes %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. The high server thread level must be large enough to grant the low number of threads to all of the nodes identified in the message for a fixed size resource pool.

Solution:

No user action required.
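
One reading of the constraint in the Description is that the high level must be at least the low level multiplied by the number of nodes. The sketch below encodes that reading only; the exact check performed by the flow control system is not given in this guide, so the formula and the function name are assumptions.

```python
def high_is_large_enough(high, low, nodes):
    """Assumed rule: `high` threads must cover `low` threads for every node."""
    return high >= low * nodes

# Hypothetical values for illustration.
print(high_is_large_enough(64, 8, 4))   # 64 >= 32
print(high_is_large_enough(16, 8, 4))   # 16 <  32
```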

448844 :clcomm: inbound_invo::done: state is 0x%x

Description:

The internal state describing the server side of a remote invocation is invalid when the invocation completes server side processing.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

448898 :%s.nodes entry in the configuration file must be between 1 and %d.

Description:

Illegal value for a node number. Perhaps the system is not booted as part of the cluster.

Solution:

Make sure the node is booted as part of a cluster.

449159 :clconf: No valid quorum_vote field for node %u

Description:

The quorum vote field was found to be incorrect while converting the quorum configuration information into the quorum table.

Solution:

Check the quorum configuration information.

449979:ALL the daemons are running on $HOSTNAME.

Description:

This message is from the Sun Cluster HA for BroadVision One-To-One Enterprise probe. All daemons are running.

Solution:

No user action required.

449288 :setgid: %s

Description:

The rpc.pmfd server was not able to set the group id of a process. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

449336 :setsid: %s

Description:

The rpc.pmfd or rpc.fed server was not able to set the session id, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

449344 :setuid: %s

Description:

The rpc.pmfd server was not able to set the user id of a process. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

449907 :scvxvmlg error - mknod(%s) failed

Description:

Solution:

450173 :Error accessing policy

Description:

This message appears when the customer is initializing or changing a scalable services load balancer, by starting or updating a service. The Load_Balancing_Policy is missing.

Solution:

Add a Load_Balancing_Policy parameter when creating the resource group.

450780 :Error: Unable to create scha_control timestamp file <%s> for resource <%s>

Description:

The rgmd has failed in an attempt to create a file used for the anti-"pingpong" feature. This may prevent the anti-pingpong feature from working, which may permit a resource group to fail over repeatedly between two or more nodes. The failure to create the file might indicate a more serious problem on the node.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.

451640 :tag %s: stat of command file %s failed

Description:

The rpc.fed server checked the command path indicated by the tag, and this check failed, possibly because the path is incorrect. An error message is output to syslog.

Solution:

Check the path of the command.

452150 :Failed to start the fault monitor.

Description:

Process monitor facility has failed to start the fault monitor.

Solution:

Check whether the system is low in memory or the process table is full and correct these problems. If the error persists, use scswitch to switch the resource group to another node.

452202 :clcomm: sdoor_sendstream::send

Description:

This operation should never occur.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

452205 :Failed to form the %s command.

Description:

The method searches the command input to the Agent Builder for occurrences of specific Builder-defined variables, e.g., $hostnames, and replaces them with the appropriate values. This action failed.

Solution:

Check syslog messages and correct the problems specified in prior syslog messages. If the error still persists, please report this problem.

452604 :CMM: Registered key on and acquired quorum device %ld (gdevname %s).

Description:

When this node was booting up, it had found only non-cluster member keys on the specified device. After joining the cluster and having its CCR recovered, this node has been able to register its keys on this device and is its owner.

Solution:

This is an informational message, no user action is needed.

453919 :Pathprefix is not set for resource group %s.

Description:

Resource Group property Pathprefix is not set.

Solution:

Use scrgadm to set the Pathprefix property on the resource group.

453207:some BV processes are still running on $HOSTNAME.

Description:

All the Sun Cluster HA for BroadVision One-To-One Enterprise processes could not be stopped and some Sun Cluster HA for BroadVision One-To-One Enterprise processes are still running.

Solution:

No user action required. The service method should stop the Sun Cluster HA for BroadVision One-To-One Enterprise processes.

454214 :Resource %s is associated with scalable resource group %s. AffinityOn set to TRUE will be ignored.

Description:

Warning message. The HAStoragePlus resource is associated with a resource group of type scalable. The AffinityOn extension property is always FALSE for a scalable resource group.

Solution:

A warning message only. No action is needed.

454247 :Error: Unable to create directory <%s> for scha_control timestamp file

Description:

The rgmd is unable to access the directory used for the anti-"pingpong" feature, and cannot create the directory (which should already exist). This may prevent the anti-pingpong feature from working, which may permit a resource group to fail over repeatedly between two or more nodes. The failure to access or create the directory might indicate a more serious problem on the node.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.

454607 :INTERNAL ERROR: Invalid resource extension property type <%d> on resource <%s>; aborting node

Description:

An attempted creation or update of a resource has failed because of invalid resource type data. This may indicate CCR data corruption or an internal logic error in the rgmd. The rgmd will produce a core file and will force the node to halt or reboot.

Solution:

Use scrgadm(1M) -pvv to examine resource properties. If the resource or resource type properties appear to be corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Retry the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.

454930 :Scheduling class %s not configured

Description:

An attempt to change the thread scheduling class failed, because the scheduling class was not configured.

Solution:

Configure the system to support the desired thread scheduling class.

455738 :Global service %s associated with path %s is unavailable. Retrying...

Description:

A warning message indicating that the given global service was detected to be unavailable. This message is logged in a loop until the service is confirmed as available.

Solution:

Check syslog for errors. Ensure that the global services are configured correctly.

457114 :fatal: death_ff->arm failed

Description:

The daemon specified in the error tag was unable to arm the failfast device. The failfast device kills the node if the daemon process dies, either because it hit a fatal bug or because an operator killed it inadvertently. This is a requirement to avoid the possibility of data corruption. The daemon will produce a core file and will cause the node to halt or reboot.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the core file generated by the daemon. Contact your authorized Sun service provider for assistance in diagnosing the problem.

457121 :Failed to retrieve the host information for %s: %s.

Description:

The data service failed to retrieve the host information.

Solution:

If the logical hostname and shared address entries are specified in the /etc/inet/hosts file, check that the entries are correct. Verify the settings in the /etc/nsswitch.conf file include "files" for host lookup. If these are correct, check the health of the name server. For more error information, check the syslog messages.
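
The nsswitch.conf check in the Solution can be sketched as a small parser that reports the lookup sources on the hosts line. The function name and the sample file contents are hypothetical.

```python
def hosts_sources(nsswitch_text):
    """Return the lookup sources on the 'hosts:' line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            return line[len("hosts:"):].split()
    return []

# Hypothetical file contents for illustration.
SAMPLE = """\
# /etc/nsswitch.conf (sample)
passwd:     files nis
hosts:      files dns   # local file first
"""
print("files" in hosts_sources(SAMPLE))
```

If "files" is not among the returned sources, entries in /etc/inet/hosts are not consulted for host lookup.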

458091 :CMM: Reconfiguration delaying for %d seconds to allow larger partitions to win race for quorum devices.

Description:

In the case of potential split brain scenarios, the CMM allows larger partitions to win the race to acquire quorum devices by forcing the smaller partitions to sleep for a time period proportional to the number of nodes not in that partition.

Solution:

This is an informational message, no user action is needed.
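
The proportionality described above can be illustrated with a short sketch. The per-node constant and the function name are hypothetical; this guide does not state the actual multiplier used by the CMM.

```python
def quorum_race_delay(total_nodes, partition_size, seconds_per_missing_node=2):
    """Delay grows with the number of nodes that are not in this partition.

    seconds_per_missing_node is an assumed constant, chosen for illustration.
    """
    if not 0 < partition_size <= total_nodes:
        raise ValueError("partition_size must be between 1 and total_nodes")
    return seconds_per_missing_node * (total_nodes - partition_size)

# In a four-node cluster, a one-node partition waits longer than a three-node partition.
print(quorum_race_delay(4, 1), quorum_race_delay(4, 3))
```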

458530 :Method <%s> on resource <%s>: program file is not executable.

Description:

A method pathname points to a file that is not executable. This may have been caused by incorrect installation of the resource type.

Solution:

Identify registered resource type methods using scrgadm(1M) -pvv. Check the permissions on the resource type methods. Reinstall the resource type if necessary, following resource type documentation.
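
Once the method pathnames are known, the permission check can be sketched as follows. The function name is hypothetical, and the list of paths must be supplied by the administrator (for example, from the scrgadm output).

```python
import os

def method_problems(paths):
    """Map each method path that cannot be run to a short reason."""
    problems = {}
    for path in paths:
        if not os.path.isfile(path):
            problems[path] = "missing or not a regular file"
        elif not os.access(path, os.X_OK):
            problems[path] = "not executable"
    return problems
```

An empty result means that every listed method file exists and is executable by the user running the check.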

458818 :reservation fatal error(%s) - disk_file not specified

Description:

The device fencing program has suffered an internal error.

Solution:

458988 :libcdb: scha_cluster_open failed with %d

Description:

Call to initialize a handle to get cluster information failed. The second part of the message gives the error code.

Solution:

The calling program should handle this error. If it is not recoverable, it will exit.

460027 :Resource <%s> of Resource Group <%s> failed sanity check on node <%s>\n

Description:

Message logged for failed scha_control sanity check methods on specific node.

Solution:

No user action required.

460520 :scvxvmlg fatal error - dcs_get_service_names_of_class(%s) failed, returned %d

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.

462083 :fatal: Resource <%s> update failed with error <%d>; aborting node

Description:

The rgmd failed to read the updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

462632 :HA: repl_mgr: exception invalid_repl_prov_state %d

Description:

The system did not perform this operation on the primary object.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

463684 :Error in opening %s: %s.

Description:

The file specifying file system mountpoints (default is /etc/vfstab) could not be opened for reading.

Solution:

Check if this file is present and readable. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

464588 :Failed to retrieve the resource group property %s: %s.

Description:

An API operation has failed while retrieving the resource group property. Low memory or API call failure might be the reasons.

Solution:

465065 :Error accessing group

Description:

This message appears when the customer is initializing or changing a scalable services load balancer, by starting or updating a service. The specified resource group is invalid.

Solution:

Check the resource group name specified and make sure that a valid value is used.

466896:Could not create file %s: %s.

Description:

Failed to create file. This failure might occur if the file has invalid permissions. This failure might also occur if there is a lack of system resources.

Solution:

Ensure the file has valid permissions. If the system is low in memory, take appropriate action.

468477 :Failed to retrieve the property %s: %s.

Description:

An internal error occurred in the rgmd while checking a cluster property.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

468732 :Too many modules configured for autopush.

Description:

The system attempted to configure a clustering STREAMS module for autopush but too many modules were already configured.

Solution:

Check your /etc/iu.ap file to see whether too many modules have been configured to be autopushed on a network adapter. Reduce the number of modules. Use the autopush(1M) command to remove some modules from the autopush configuration.

469417 :Failfast: timeout - unit \"%s\"%s.

Description:

A failfast client has encountered a timeout and is going to panic the node.

Solution:

There may be other related messages on this node which may help diagnose the problem. Resolve the problem and reboot the node if node panic is unexpected.

469817:Specified resource group does not exist: %s.

Description:

The name of a specified resource group is invalid. Such a resource group does not exist.

Solution:

This is probably the result of specifying an incorrect resource group name in a dependency, or in an extension property of a resource or resource group. Repeat the steps that led to this error using an existing resource group name.

470970:Device switchovers cannot be performed since resource group is scalable.

Description:

A warning message to the effect that device switchovers are being ignored because the resource group is scalable, i.e., AffinityOn is always FALSE.

Solution:

An informational message only. No action is needed.

471241:Probing SAP Message Server times out with command %s.

Description:

The check of the Sun Cluster HA for SAP message server with the lgtst utility timed out. This might occur under heavy system load.

Solution:

Increase the Probe_timeout property, or switch the resource group to another node using scswitch(1M).

472185 :Failed to retrieve the resource group property %s for %s: %s.

Description:

The query for a property failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

473021 :in libsecurity uname sys call failed: %s

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) because the host name could not be obtained. The system error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

473460 :Method <%s> on resource <%s>: authorization error: %s.

Description:

An attempted method execution failed, apparently due to a security violation; this error should not occur. The last portion of the message describes the error. This failure is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Correct the problem identified in the error message. If necessary, examine other syslog messages occurring at about the same time to see if the problem can be diagnosed. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.

473653 :Failed to retrieve the resource type handle for %s while querying for property %s: %s.

Description:

Access to the object named failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

474256 :Validations of all specified global device services complete.

Description:

All global device services associated with both GlobalDevicePaths and FilesystemMountPoint extension properties have been validated successfully and are found to be in the normal state. The RGM and DSDL components are found to be in the normal state.

Solution:

An informational message only. No action is needed.

474690 :clexecd: Error %d from send_fd

Description:

The clexecd program has encountered a failed fcntl(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

475303:Could not start BV 121 daemons on $HOSTNAME.

Description:

The Sun Cluster HA for BroadVision One-To-One Enterprise servers could not start on the specified host. This failure occurs if the orbixd does not properly start or if there are configuration errors.

Solution:

See if there are any internal errors. Verify the Sun Cluster HA for BroadVision One-To-One Enterprise configuration. Manually start Sun Cluster HA for BroadVision One-To-One Enterprise on the specified host. If the orbixd daemon does not start, contact your authorized Sun service provider for assistance. Provide your authorized Sun service provider a copy of the /var/adm/messages files from all nodes.

475398 :Out of memory (memory allocation failed):%s.%s

Description:

There is not enough swap space on the system.

Solution:

Add more swap space. See swap(1M) for more details.

476996 :Failed to unregister callback for NAFO group %s with tag %s (request failed with %d).

Description:

An unexpected error occurred while trying to communicate with the network monitoring daemon (pnmd).

Solution:

Make sure the network monitoring daemon (pnmd) is running.

477296:Validation failed. SYBASE ASE STOP_FILE %s not found.

Description:

The file specified in the STOP_FILE extension property was not found, or the file specified is not an ordinary file.

Solution:

Check that the file specified in the STOP_FILE extension property exists on all the nodes.

477378 :Failed to restart the service.

Description:

The attempt to restart the data service failed.

Solution:

Check the syslog messages that occurred just before this message to determine whether there is an internal error. In case of an internal error, contact your Sun service provider. Otherwise, one of the following situations might have occurred. 1) The Start_timeout and Stop_timeout values are not appropriate. Check these values and adjust them if necessary. 2) The system lacks resources. Check whether the system is low in memory or the process table is full and take appropriate action.

477828 :This node is not in the replica nodelist of global service %s associated with path %s. No device switchover can be done to this node.

Description:

Attempts to designate this cluster node as a primary for the global (DCS) service will fail. In other words, the device service cannot be switched to this node, i.e., made primary on this node.

Solution:

Ensure that the resource group's potential primary list and the device service's potential primary list are identical.

478523 :Could not mount '%s' because there was an error (%d) in opening the directory.

Description:

While mounting a Cluster file system, the directory on which the mount is to take place could not be opened.

Solution:

Fix the reported error and retry. The most likely problem is that the directory does not exist - in that case, create it with the appropriate permissions and retry.
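
Creating a missing mount-point directory can be sketched as follows. The function name and the default mode of 0755 are assumptions; use the permissions appropriate for the file system in question.

```python
import os

def ensure_mount_point(path, mode=0o755):
    """Create the mount-point directory if it is missing; return True if it was created."""
    if os.path.isdir(path):
        return False
    os.makedirs(path, mode=mode)
    return True
```

After the directory exists, retry the mount.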

479105 :Cannot get service status for global service <%s> of path <%s>

Description:

Cannot get status for the global service. This is a severe problem.

Solution:

Contact your authorized Sun service provider to determine the cause of the problem.

479213 :Monitor server terminated.

Description:

Graceful shutdown did not succeed. Monitor server processes were killed in STOP method. It is likely that adaptive server terminated prior to shutdown of monitor server.

Solution:

Check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.

479432 :The application process tree has died and the action to be taken as determined by scds_fm_history is to failover. However the application is not being failed over because the failover_enabled extension property is set to false. Restarting the application instead.

Description:

The failover_enabled property is set to false. The probe is trying to restart the application locally instead of failing it over.

Solution:

This is an informational message, no user action is needed.

479442 :in libsecurity could not allocate memory

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start, or a client was not able to make an rpc connection to the server, probably due to low memory. An error message is output to syslog.

Solution:

Investigate if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

483160 :Failed to connect to socket: %s.

Description:

While determining the health of the resource, process monitor facility has failed to communicate with the resource fault monitor.

Solution:

Any of the following situations might have occurred. 1) The fault monitor is not running. Check whether it is running; if not, wait for the fault monitor to start. 2) The fault monitor is disabled. If it is, you can enable the fault monitor; otherwise, ignore this message. 3) In all other situations, consider this an internal error. Save the /var/adm/messages file and contact your authorized Sun service provider. For more error description, check the syslog messages.

483528:NULL value returned for resource name.

Description:

A null value was returned for resource name.

Solution:

Verify the resource name.

484084 :INTERNAL ERROR: non-existent resource <%s> appears in dependency list of resource <%s>

Description:

While attempting to execute an operator-requested enable of a resource, the rgmd has found a non-existent resource to be listed in the Resource_dependencies or Resource_dependencies_weak property of the indicated resource. This suggests corruption of the RGM's internal data but is not fatal.

Solution:

Use scrgadm(1M) -pvv to examine resource group properties. If the values appear corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

484513:Failed to retrieve the probe command with error <%d>. Will continue to do the simple probe.

Description:

The fault monitor failed to retrieve the probe command from the cluster configuration. It will continue using the simple probe to monitor the application.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

485464 :clcomm: Failed to allocate simple xdoor server %d

Description:

The system could not allocate a simple xdoor server. This can happen when the xdoor number is already in use. This message is only possible on debug systems.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

485759 :transition '%s' failed for cluster '%s'

Description:

The mentioned state transition failed for the cluster. udlmctl will exit.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

485942 :(%s) sigprocmask failed: %s (UNIX errno %d)

Description:

Call to sigprocmask() failed. The "sigprocmask" man page describes possible error codes. udlmctl will exit.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

487022 :The networking components for scalable resource %s have been configured successfully for method %s.

Description:

The calls to the underlying scalable networking code succeeded.

Solution:

This is an informational message, no user action is needed.

487484 :lkcm_reg: lib initialization failed

Description:

udlm could not register with cmm because lib initialization failed.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

487574 :Failed to alloc memory

Description:

A scha_control call failed because the system has run out of swap space. The system is likely to halt or reboot if swap space continues to be depleted.

Solution:

Investigate the cause of swap space depletion and correct the problem, if possible.

487827 :CCR: Waiting for repository synchronization to finish.

Description:

This node is waiting to finish the synchronization of its repository with other nodes in the cluster before it can join the cluster membership.

Solution:

This is an informational message; generally no user action is needed. If all the nodes in the cluster hang at this message for a long time, look for other messages. Possible causes are that the cluster has not obtained quorum, or that CCR metadata is missing or invalid. If the cluster is hanging because of missing or invalid metadata, the CCR metadata must be recovered from backup.

488276 :in libsecurity write of file %s failed: %s

Description:

The rpc.pmfd, rpc.fed or rgmd server was unable to write to a cache file for rpcbind information. The affected component should continue to function by calling rpcbind directly.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

489438 :clcomm: Path %s being drained

Description:

A communication link is being removed with another node. The interconnect may have failed or the remote node may be down.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.

489644 :Could not look up host because host was NULL.

Description:

Can't look up the hostname locally in hostfile. The specified host name is invalid.

Solution:

Check whether the hostname has a NULL value. If it does, recreate the resource with a valid host name. If it does not, treat this as an internal error and contact your authorized Sun service provider.

491081 :resource %s removed.

Description:

This is a notification from the rgmd that the operator has deleted a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

491579 :clcomm: validate_policy: fixed size pool low %d must match moderate %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. The low and moderate server thread levels must be the same for fixed size resource pools.

Solution:

No user action required.
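The rule described above can be illustrated with a short sketch. This is not the clcomm code; the function name validate_fixed_pool_policy and its arguments are hypothetical, and only the message wording is taken from this entry.

```python
def validate_fixed_pool_policy(low, moderate):
    """Check the thread levels proposed for a fixed size resource pool.

    Returns None when the levels are acceptable, otherwise a string
    patterned on the logged message (hypothetical helper for illustration).
    """
    if low != moderate:
        return ("validate_policy: fixed size pool low %d must match moderate %d"
                % (low, moderate))
    return None


print(validate_fixed_pool_policy(4, 4))
print(validate_fixed_pool_policy(2, 4))
```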

491694 :Could not %s any IP addresses.

Description:

The specified action was not successful for all IP addresses managed by the LogicalHostname resource.

Solution:

Check the logs for any error messages from pnm. This could result from a lack of system resources, such as low memory. Reboot the node if the problem persists.

491738 :Local node failed to do affinity switchover to global service <%s> of path <%s>:%s

Description:

When prenet_start method of SUNW.HAStorage attempted an affinity switch, it failed.

Solution:

The affinity switchover may have failed because an equivalent switchover was in progress at the time. The service may have come online successfully later during boot. Use the scstat(1M) -g command to verify service availability and scstat(1M) -D to identify the primary server. If the service does not reflect the expected configuration, retry the affinity switchover with scswitch(1M).

492603 :launch_fed_prog: fe_method_full_name() failed for program <%s>

Description:

The ucmmd was unable to assemble the full method pathname for the fed program to be launched. This is considered a launch_fed_prog failure.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

492953 :ORACLE_HOME/bin/lsnrctl not found ORACLE_HOME=%s

Description:

Oracle listener binaries not found under ORACLE_HOME. ORACLE_HOME specified for the resource is indicated in the message. HA-Oracle will not be able to manage Oracle listener if ORACLE_HOME is incorrect.

Solution:

Specify the correct ORACLE_HOME when creating the resource. If the resource has already been created, update the resource property 'ORACLE_HOME'.

494534 :clcomm: per node IP config %s%d:%d (%d): %d.%d.%d.%d failed with %d

Description:

The system failed to configure IP communications across the private interconnect of this device and IP address, resulting in the error identified in the message. This happened during initialization. Someone has used the "lo0:1" device before the system could configure it.

Solution:

If you used "lo0:1", please use another device. Otherwise, contact your authorized Sun service provider to determine whether a workaround or patch is available.

494913 :pmfd: unknown action (0x%x)

Description:

An internal error has occurred in the rpc.pmfd server. This should not happen.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

495284 :dl_attach: DLPI error %u

Description:

Could not attach to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.

495386 :INTERNAL ERROR: %s.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

495529 :Prog <%s> failed to execute step <%s> - <%s>

Description:

ucmmd failed to execute a step.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.

495710 :Stopping oracle server using shutdown immediate

Description:

Informational message. Oracle server will be stopped using 'shutdown immediate' command.

Solution:

None.

496884 :Despite the warnings, the validation of the hostname list succeeded.

Description:

While validating the hostname list, non-fatal errors were found.

Solution:

This is an informational message. Correct the errors if applicable. For the error information, check the syslog messages that were logged before this message.

496746 :reservation error(%s) - USCSI_RESET failed for device %s, returned %d.

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

This may be indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.
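The transition-specific follow-up actions above can be summarized as a lookup table. The sketch below is only a reading aid in Python: the names FOLLOW_UP and follow_up_for are hypothetical, the actions are paraphrased from this entry, and nothing here runs any cluster command.

```python
# Command quoted from the Solution text above.
RUN_RESERVE = "/usr/cluster/lib/sc/run_reserve -c node_join"

# Transition named in the message -> follow-up once the hardware problem is fixed.
FOLLOW_UP = {
    "node_join": "Run '%s' on all cluster nodes." % RUN_RESERVE,
    "release_shared_scsi2": "Run '%s' on all cluster nodes." % RUN_RESERVE,
    "make_primary": ("Use scswitch to move the device group to this node, "
                     "or to start it if it was not started."),
    "primary_to_secondary": "Retry the shutdown or switchover of the device group.",
}


def follow_up_for(transition):
    """Return the follow-up action listed for a transition, if any."""
    return FOLLOW_UP.get(
        transition,
        "No transition-specific action is listed; resolve the hardware problem.",
    )


print(follow_up_for("make_primary"))
```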

496991:BV Config Error:IMs not configured on either the physical or the private interconnect.

Description:

The Interaction Managers are not configured on either the physical node or on the private node.

Solution:

Reconfigure the Interaction Managers on a physical host or on a cluster private IP. Refer to your Sun Cluster HA for BroadVision One-To-One Enterprise documentation.

497795 :gethostbyname() timed out.

Description:

The name service could be unavailable.

Solution:

If the cluster is under heavy load or there is too much network traffic, increase the timeout value of the monitor_check method using the scrgadm command. Otherwise, check whether the name service is configured correctly. Try some commands that query name servers, such as ping and nslookup, and correct the problem. If the error persists, reboot the node.
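To check by hand whether the name service answers within a given time, a lookup can be wrapped in a timeout. The following is a minimal sketch using the Python standard library; it is not how the monitor_check method is implemented, and the function name lookup_with_timeout and its resolver parameter are assumptions made for the example.

```python
import concurrent.futures
import socket


def lookup_with_timeout(host, timeout_s, resolver=socket.gethostbyname):
    """Resolve `host`, returning its address, or None if the lookup
    does not finish within `timeout_s` seconds.

    The lookup runs in a worker thread because gethostbyname itself
    takes no timeout argument. Resolver errors are not caught here.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(resolver, host)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return None
    finally:
        # Do not block on a lookup that is still pending.
        executor.shutdown(wait=False)
```

A result of None for a host name that should resolve suggests that the name service is slow or unavailable rather than that the name is wrong; a wrong name normally raises an error instead.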

497808 :clcomm: Cannot fork() after cluster initialization

Description:

A user level process attempted to fork after cluster initialization. This is not allowed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

498582 :Attempt to load %s failed: %s.

Description:

A shared address resource was in the process of being created. To prepare this node to handle scalable services, the system attempted to load the specified kernel module, but the attempt failed.

Solution:

This might result from a lack of system resources. Check whether the system is low on memory and take appropriate action (for example, by killing hung processes). For specific information, check the syslog message. After more resources are available on the system, attempt to create the shared address resource again. If the problem persists, reboot the node.

498711 :Could not initialize the ORB. Exiting.

Description:

clexecd program was unable to initialize its interface to the low-level clustering software.

Solution:

This might occur because the operator has attempted to start clexecd program on a node that is booted in non-cluster mode. If the node is in non-cluster mode, boot it into cluster mode. If the node is already in cluster mode, contact your authorized Sun service provider to determine whether a workaround or patch is available.

499486 :Unable to set socket flags: %s.

Description:

Failed to set the non-blocking flag for the socket used in communicating with the application.

Solution:

This is an internal error. Contact your authorized Sun service provider; no other user action is required.
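For readers unfamiliar with the operation that failed, setting the non-blocking flag on a socket usually looks like the following sketch. It uses the Python fcntl module on a UNIX system and is an illustration only, not the fault monitor's code; the function name set_nonblocking is hypothetical, and the returned string merely mirrors the message format of this entry.

```python
import fcntl
import os
import socket


def set_nonblocking(sock):
    """Set O_NONBLOCK on the socket's file descriptor.

    Returns None on success, or a string patterned on the logged
    message if the fcntl call fails.
    """
    try:
        flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFL)
        fcntl.fcntl(sock.fileno(), fcntl.F_SETFL, flags | os.O_NONBLOCK)
    except OSError as err:
        return "Unable to set socket flags: %s." % err.strerror
    return None


# Example on a local socket pair (no network access needed).
left, right = socket.socketpair()
print(set_nonblocking(left))
left.close()
right.close()
```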

499756 :CMM: Node %s: joined cluster.

Description:

The specified node has joined the cluster.

Solution:

This is an informational message, no user action is needed.

499775 :resource group %s added.

Description:

This is a notification from the rgmd that a new resource group has been added. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

499802:Successfully started BV daemons on $HOSTNAME.

Description:

The Sun Cluster HA for BroadVision One-To-One Enterprise processes on the specified host successfully started.

Solution:

No user action required.