Sun Cluster 3.0 5/02 Error Messages Guide

Error Message List

The following list is ordered by the message ID.

700161 :Fault monitor is already running.

Description:

The resource's fault monitor is already running.

Solution:

This is an internal error. Save the /var/adm/messages file from all the nodes. Contact your authorized Sun service provider.

700321 :exec() of %s failed: %m.

Description:

The exec() system call failed for the given reason.

Solution:

Verify that the pathname given is valid.

701136 :Failed to stop monitor server.

Description:

Sun Cluster HA for Sybase failed to stop the Monitor Server using the KILL signal.

Solution:

Check whether any Sybase server processes are running on the server. Manually shut down the server.

702368 :Failed to register callback for NAFO group %s with tag %s and callback command %s (request failed with %d).

Description:

An unexpected error occurred while trying to communicate with the network monitoring daemon (pnmd).

Solution:

Make sure the network monitoring daemon (pnmd) is running.

703476 :clcomm: unable to create desired unref threads

Description:

The system was unable to create the threads that deal with objects that are no longer needed. The system fails to create threads when memory is not available. Either the kernel or a user-level process can generate this message. The kernel creates unref threads when the cluster starts. A user-level process creates threads when it initializes.

Solution:

Take steps to increase memory availability. If the kernel was unable to create threads, install more memory. If a user-level process was unable to create threads, install more memory, increase swap space, or reduce the peak workload.

703553 :Resource group name or resource name is too long.

Description:

The process monitor facility failed to execute the command. The resource group name or resource name is too long for the process monitor facility command.

Solution:

Check the resource group name and resource name. Give the resource group or resource a shorter name.

703744 :reservation fatal error(%s) - get_cluster_state() exception.

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

704795 :in libsecurity could not negotiate uid on any transport in NETPATH

Description:

A server (rpc.pmfd, rpc.fed, or rgmd) was not able to start because it could not find any transport with which to establish an RPC connection for the specified network. Either no transports are available at all, or transports are available but none is a loopback transport. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

705163 :load balancer thread failed to start for %s

Description:

The system has run out of the resources that are required to create a thread. The system could not create the load balancer thread.

Solution:

The service group is created with the default load balancing policy. If rebalancing is required, free up resources by shutting down some processes. Then delete the service group and re-create it.

705308 :Could not start the monitor server.

Description:

Sun Cluster HA for Sybase failed to start the Monitor Server. Other syslog messages and the log file should provide additional information on possible reasons for the failure.

Solution:

Manually start the Monitor Server. Examine the log files and setup. See if the START method timeout value is set too low.

705379 :resource %s state on node %s change to R_ONLINE_UNMON

Description:

This is a notification from the rgmd that a resource has changed state to online-not-monitored on the given node. This may be used by system monitoring tools.

Solution:

This is an informational message. No user action is needed.

705629 :clutil: Can't allocate hash table

Description:

The system attempted unsuccessfully to allocate a hash table. There was insufficient memory.

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.

706159 :Failed to switchover resource group %s: %s

Description:

An attempt to switch over the specified resource group failed. The reason for the failure is logged.

Solution:

Look for the message indicating the reason for this failure. This should help in the diagnosis of the problem.

706314 :clexecd: Error %d from open(/dev/zero). Exiting.

Description:

The clexecd program has encountered a failed open(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

707301 :lkcm_reg: Unix DLM version (???) and the OSD library version (%d) are not compatible. Unix DLM versions acceptable to this library are: %d

Description:

UNIX DLM and Oracle DLM are not compatible. Compatible versions will be printed as part of this message.

Solution:

Check the installation procedure to make sure that you have the correct versions of Oracle DLM and Unix DLM. Contact your Sun service representative if the versions cannot be resolved.

707421 :%s: Cannot create a thread.

Description:

Solaris has reached its limit on threads. Either too many clients are requesting a service, causing many threads to be created at once, or the system is overloaded with processes.

Solution:

Reduce the system load by reducing the number of requestors of this service or by halting other processes on the system.

707881 :clcomm: thread_create failed for autom_thread

Description:

The system could not create the needed thread, because there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage. Since this happens during system startup, application memory usage is normally not a factor.

707948 :launching method <%s> for resource <%s>, resource group <%s>, timeout <%d> seconds

Description:

RGM has invoked a callback method for the named resource, as a result of a cluster reconfiguration, scha_control GIVEOVER, or scswitch.

Solution:

This is an informational message. No user action is needed.

709694 :Cannot remove file %s/%s.krg.

Description:

The Sybase Adaptive Server uses the $ADPSERVER_SHM_DIR/$ADAPTIVE_SERVER_NAME.krg file to store information about Solaris IPC objects. Graceful shutdowns result in automatic deletion of this file. Sun Cluster HA for Sybase attempts to remove this file prior to the Sybase Adaptive Server startup/shutdown process and logs this error message if it encounters an error.

Solution:

Remove the file using the root account, if necessary.

710143 :Failed to add node %d to scalable service group %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

711956 :open /dev/ip failed: %s.

Description:

The system attempted to open the specified device but was unable to do so.

Solution:

This might be the result of a lack of system resources. Check whether the system is low on memory and take appropriate action. For specific error information, check the syslog message.

712367 :clcomm: Endpoint %p: deferred task not allowed in state %d

Description:

The system maintains information about the state of an Endpoint. A deferred task is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

714123 :Stopping the backup server.

Description:

Sun Cluster HA for Sybase is shutting down the Backup Server.

Solution:

No user action required.

714838 :reservation fatal error(%s) - Unable to open name file '%s', errno %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

715958 :Method <%s> on resource <%s> stopped due to receipt of signal <%d>

Description:

A resource method was stopped by a signal, most likely resulting from an operator-issued kill(1). The method is considered to have failed.

Solution:

The operator must kill the stopped method. The operator may then choose to issue an scswitch(1M) command to bring resource groups onto desired primaries, or re-try the administrative action that was interrupted by the method failure.

716023 :BV1TO1 variable not set.

Description:

The BV1TO1 variable is not configured in the bv1to1.conf file.

Solution:

Reconfigure the Sun Cluster HA for BroadVision One-To-One Enterprise site with the proper BV1TO1 value.

716253 :launch_fed_prog: fe_set_env_vars() failed for program <%s>, step <%s>

Description:

The ucmmd server was not able to get the locale environment. An error message is output to syslog.

Solution:

Investigate whether the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

717089 :NAFO group %s has unknown status %d. Skipping this NAFO group.

Description:

The status of the NAFO group is not among the set of statuses that are known.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

718325 :Failed to stop development system within %d seconds. Will continue to stop the development system in the background. Meanwhile, the production system Central Instance is started up now.

Description:

The development system failed to shut down within the timeout period. It will continue shutting down in the background. Meanwhile, the Central Instance will be started.

Solution:

No action needed. You might consider increasing the Dev_stop_pct property or Start_timeout property.

718457 :Dispatcher Process is not running. pid was %d.

Description:

The main dispatcher process is not present in the process list, indicating the main dispatcher is not running on this node.

Solution:

No user action required. The fault monitor should detect that the main dispatcher process is not running and take appropriate action.

719114 :Failed to parse key/value pair from command line for %s.

Description:

The validate method for the scalable resource network configuration code was unable to convert the property information given to a usable format.

Solution:

Verify that the property information was properly set when the resource was configured.

719497 :clcomm: path_manager using RT lwp rather than clock interrupt

Description:

The system has been built to use a real time thread to support path_manager heart beats instead of the clock interrupt.

Solution:

No user action is required.

719735 :resource %s state on node %s change to R_PENDING_BOOT

Description:

This is a notification from the rgmd that the BOOT method is running on the given resource, on a node that has recently joined the cluster. This may be used by system monitoring tools.

Solution:

This is an informational message. No user action is needed.

719997 :Failed to pre-allocate swap space

Description:

The pmfd, fed, or other program was not able to allocate swap space. This means that the machine is low in swap space. The server does not come up, and an error message is output to syslog.

Solution:

Investigate whether the machine is running out of swap space. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

721252 :cm2udlm: cm_getclustmbyname: %s

Description:

Could not create a structure for communication with the cluster monitor process.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

721263 :Extension property <stop_signal> has a value of <%d>

Description:

Resource property stop_signal is set to a value or has a default value.

Solution:

This is an informational message. No user action is needed.

721881 :dl_attach: kstr_msg failed %d error

Description:

Could not attach to the private interconnect.

Solution:

Rebooting the node might fix the problem.

722270 :fatal: cannot create state machine thread

Description:

The rgmd was unable to create a thread upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

722904 :Failed to open the resource group handle: %s.

Description:

An API operation has failed while retrieving the resource group property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components. For the resource group name and the property name, check the current syslog message.

722984 :call to rpc.fed failed for resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd failed in an attempt to execute a method, due to a failure to communicate with the rpc.fed daemon. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state. If the rpc.fed process died, this might lead to a subsequent reboot of the node.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.

723206 :SAP is already running.

Description:

Sun Cluster HA for SAP is already running either locally on this node or remotely on a different node in the cluster outside of the control of the Sun Cluster software.

Solution:

Shut down Sun Cluster HA for SAP before starting Sun Cluster HA for SAP under the control of the Sun Cluster software.

724035 :Failed to connect to %s secure port %d.

Description:

An error occurred while the fault monitor was trying to connect to a secure port specified in the Port_list property for this resource.

Solution:

Check to make sure that the Port_list property is correctly set to the same port number that the Netscape Directory Server is running on.

726004 :Invalid timeout value %d passed.

Description:

Failed to execute the command under the specified timeout. The specified timeout is invalid.

Solution:

Respecify a positive, non-zero timeout value.

726417 :read %d for %sport

Description:

Could not get the port information from config file udlm.conf.

Solution:

Check that the udlm.conf file exists and has an entry for udlm.port. If everything looks normal and the problem persists, contact your Sun service representative.

727065 :CMM: Enabling failfast on quorum device %s failed with error %d.

Description:

An attempt to enable failfast on the specified quorum device failed with the specified error.

Solution:

Check if the specified quorum disk has failed. This message may also be logged when a node is booting up and has been preempted from the cluster, in which case no user action is necessary.

727160 :msg of wrong version %d, expected %d

Description:

udlmctl received an illegal message.

Solution:

None. udlm will handle this error.

727317 :Entry in %s for file system mount point %s incorrect. %s.

Description:

The entry for a file system mount point in /etc/vfstab is inconsistent. Local file systems managed by HAStoragePlus should not be specified as PxFS file systems and should have the 'mount at boot' flag set to 'no'. The HAStoragePlus validate and start methods inspect the /etc/vfstab file for inconsistencies.

Solution:

Ensure that file system entries are specified correctly.
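
The two conditions described above can be checked with a short script. The following Python sketch is only an illustration, not the HAStoragePlus validation code. It assumes the standard seven-field /etc/vfstab layout (device to mount, device to fsck, mount point, file system type, fsck pass, mount at boot, mount options) and assumes that a PxFS file system is marked by the 'global' mount option. The function name and the sample device paths are hypothetical.

```python
def check_vfstab_entry(line):
    """Return a list of problems found in one /etc/vfstab line
    for a local file system managed by HAStoragePlus."""
    fields = line.split()
    if len(fields) != 7:
        return ["expected 7 fields, found %d" % len(fields)]
    mount_at_boot, options = fields[5], fields[6]
    problems = []
    if mount_at_boot != "no":
        problems.append("'mount at boot' flag must be 'no'")
    # Assumption: a PxFS (global) file system carries the 'global' option.
    if "global" in options.split(","):
        problems.append("entry must not use the 'global' mount option")
    return problems


# Hypothetical entry that violates both conditions.
entry = "/dev/dsk/c1t0d0s0 /dev/rdsk/c1t0d0s0 /local/data ufs 2 yes global,logging"
for problem in check_vfstab_entry(entry):
    print(problem)
```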

728216 :reservation error(%s) - did_get_path() error

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

728425 :INTERNAL ERROR: bad state <%s> (%d) for resource group <%s> in rebalance()

Description:

An internal error has occurred in the rgmd. This may prevent the rgmd from bringing the affected resource group online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

728881 :Failed to read data: %s.

Description:

Failed to read the data from the socket. The reason might be an expired timeout, a hung application, or heavy load.

Solution:

Check if the application is hung. If this is the case, restart the application.

728928 :CCR: Can't access table %s on node %s errno = %d.

Description:

The indicated error occurred when CCR tried to access the indicated table on the nodes in the cluster. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28 (ENOSPC) indicates that the root file system on the node is full. Other values of errno can be returned when the root disk has failed (EIO).

Solution:

There may be other related messages on the node where the failure occurred. They may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced. If the indicated table was accidentally removed, boot the indicated node in -x mode to restore the indicated table from backup. The CCR tables are located at /etc/cluster/ccr/.
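
To interpret the errno value reported in this message, it can help to map the number to its symbolic name. The following Python sketch is an illustration only. The function name and the suggested-action strings are hypothetical, and the symbolic names come from the errno definitions of the platform on which the script runs.

```python
import errno
import os


def describe_ccr_errno(value):
    """Return (symbolic name, system description, suggested action)
    for an errno value taken from the message."""
    name = errno.errorcode.get(value, "UNKNOWN")
    if value == errno.ENOSPC:
        action = "free up space on the root file system"
    elif value == errno.EIO:
        action = "check whether the root disk has failed"
    else:
        action = "look for related messages on the node"
    return name, os.strerror(value), action


print(describe_ccr_errno(errno.ENOSPC))
```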

729152 :clexecd: Error %d from F_SETFD. Exiting.

Description:

The clexecd program has encountered a failed fcntl(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

730190 :scvxvmlg error - found non device-node or non link %s, directory not removed

Description:

The program responsible for maintaining the VxVM namespace has detected suspicious entries in the global device namespace.

Solution:

The global device namespace should only contain diskgroup directories and volume device nodes for registered diskgroups. The specified path was not recognized as either of these and should be removed from the global device namespace.

730685 :PCSTATUS: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

730782 :Failed to update scalable service: Error %d.

Description:

Update to a property related to scalability was not successfully applied to the system.

Solution:

Use scswitch to try to bring the resource offline and online again on this node. If the error persists, reboot the node and contact your Sun service representative.

730956 :%d entries found in property %s. For a nonsecure Netscape Directory Server instance %s should have exactly one entry.

Description:

Since a nonsecure Netscape Directory Server instance only listens on a single port, the list property should only have a single entry. A different number of entries was found.

Solution:

Change the number of entries to be exactly one.

732069 :dl_attach: DL_ERROR_ACK protocol error

Description:

Could not attach to the physical device.

Solution:

Check the documentation for the driver associated with the private interconnect. It might be that the message returned is too small to be valid.

732569 :reservation error(%s) error. Not found clexecd on node %d.

Description:

The device fencing code was unable to communicate with another node.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

732787 :bv1to1.conf.sh file is not found in the %s/etc directory.

Description:

The bv1to1.conf.sh file is not accessible.

Solution:

Verify that the file exists in $BV1TO1_VAR/etc/bv1to1.conf.sh. If the file exists in this directory, ensure that the BV1TO1_VAR extension property is correctly set.

732822 :clconf: Invalid group name

Description:

An invalid group name has been encountered while converting a group name to clconf_obj type. Valid group names are "cluster", "nodes", "adapters", "ports", "blackboxes", "cables", and "quorum_devices".

Solution:

This is an unrecoverable error, and the cluster needs to be rebooted. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.

733367 :lkcm_act: %s: %s cm_reconfigure failed

Description:

ucmm reconfiguration failed.

Solution:

None if the next reconfiguration succeeds. If not, save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

734057 :clcomm: Duplicate TypeId's: %s : %s

Description:

The system records type identifiers for multiple kinds of type data. The system checks for type identifiers when loading type information. This message identifies two items having the same type identifiers. This checking only occurs on debug systems.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

734173 :No hosts are configured in bv1to1.conf.

Description:

There are no hosts configured to run Sun Cluster HA for BroadVision One-To-One Enterprise processes.

Solution:

Configure the Sun Cluster HA for BroadVision One-To-One Enterprise hosts in the bv1to1.conf file.

734832 :clutil: Created insufficient threads in threadpool

Description:

There was insufficient memory to create the desired number of threads.

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.

734890 :pthread_detach: %s

Description:

The rpc.pmfd server was not able to detach a thread, possibly due to low memory. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If all looks correct, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

736937 :Could not start the backup server.

Description:

Sun Cluster HA for Sybase failed to start the Backup Server. Other syslog messages and the log file should provide additional information on possible reasons for the failure.

Solution:

Manually start the Backup Server. Examine the log files and setup. See if the START method timeout value is set too low.

737104 :Received unexpected result <%d> from rpc.fed, aborting node

Description:

This node encountered an unexpected error while communicating with other cluster nodes during a cluster reconfiguration. The ucmmd will produce a core file and will cause the node to halt or reboot.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

737125 :INTERNAL ERROR: PENDING_OFF_STOP_FAILED or ERROR_STOP_FAILED in rebalance()

Description:

An internal error has occurred in the rgmd. This may prevent the rgmd from bringing affected resource groups online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

737445 :rebalance: not attempting to start resource group <%s> on node <%s> because this resource group has already failed to start on this node %d or more times in the past %d seconds

Description:

The rgmd is preventing "ping-pong" failover of the resource group, i.e., repeated failover of the resource group between two or more nodes.

Solution:

The time interval in seconds that is mentioned in the message can be adjusted by using scrgadm(1M) to set the Pingpong_interval property of the resource group.
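
The message text describes a check of the form "failed N or more times in the past T seconds". The following Python sketch shows that kind of check in isolation. It is not the rgmd implementation. The function name, the parameter names, and the default threshold of one failure are assumptions made for illustration.

```python
def should_skip_start(failure_times, now, pingpong_interval, max_failures=1):
    """Return True if the resource group failed to start on this node
    at least max_failures times within the last pingpong_interval seconds.

    failure_times: timestamps (in seconds) of earlier start failures.
    """
    recent = [t for t in failure_times if 0 <= now - t <= pingpong_interval]
    return len(recent) >= max_failures


# One failure 100 seconds ago, interval of 3600 seconds: do not retry yet.
print(should_skip_start([1000], now=1100, pingpong_interval=3600))
```

A larger interval makes a node ineligible for longer after a failed start. A smaller interval allows retries sooner at the cost of more frequent failover attempts.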

737572 :PMF error when starting Sybase %s: %s. Error: %s

Description:

Sun Cluster HA for Sybase failed to start the Sybase server using the Process Monitoring Facility (PMF). Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Determine whether the server can be started manually. Examine the HA-Sybase log files, the Sybase log files, and the setup.

738465 :Malformed adapter specification %s.

Description:

Failed to retrieve the NAFO information. The given adapter specification is invalid.

Solution:

Check whether the adapter specification is in the form of nafogroup@nodename. If not, recreate the resource with the properly formatted adapter information.
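
The nafogroup@nodename form can be checked before the resource is created. The following Python sketch is a minimal illustration of such a check, not the code that Sun Cluster uses. The function name and the sample values are hypothetical.

```python
def parse_adapter_spec(spec):
    """Split 'nafogroup@nodename' into (group, node).

    Raise ValueError if either part is missing or if the
    specification contains more than one '@'.
    """
    group, sep, node = spec.partition("@")
    if not sep or not group or not node or "@" in node:
        raise ValueError("Malformed adapter specification %s." % spec)
    return group, node


print(parse_adapter_spec("nafo0@phys-node-1"))
```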

738847 :clexecd: unable to create failfast object.

Description:

The clexecd program could not enable one of the mechanisms that causes the node to shut down, to prevent data corruption, when the clexecd program dies.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

739356 :warning: cannot store start_failed timestamp for resource group <%s>: time() failed, errno <%d> (%s)

Description:

The specified resource group failed to come online on some node, but this node is unable to record that fact due to the failure of the time(2) system call. The consequence of this is that the resource group may continue to pingpong between nodes for longer than the Pingpong_interval property setting.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. If the same error recurs, you might have to reboot the affected node.

739653 :Port number %d is listed twice in property %s, at entries %d and %d.

Description:

The port number in the message was listed twice in the named property, at the list entry locations given in the message. A port number should only appear once in the property.

Solution:

Specify the property with only one occurrence of the port number.
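
A duplicate of this kind can be located before the property is set. The following Python sketch is illustrative only. It treats the property as a plain list of port numbers and assumes that entry positions are counted from 1; neither detail is confirmed by the message text. The function name is hypothetical.

```python
def find_duplicate_ports(ports):
    """Return (port, first_entry, repeated_entry) tuples for every
    port number that appears more than once in the list."""
    first_seen = {}
    duplicates = []
    for position, port in enumerate(ports, start=1):
        if port in first_seen:
            duplicates.append((port, first_seen[port], position))
        else:
            first_seen[port] = position
    return duplicates


for port, first, repeat in find_duplicate_ports([389, 636, 389]):
    print("Port number %d is listed at entries %d and %d." % (port, first, repeat))
```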

740373 :Failed to get the scalable service related properties for resource %s.

Description:

An unexpected error occurred while trying to collect the properties related to scalable networking for the named resource.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

740972 :in fe_set_env_vars setlocale failed

Description:

The rgmd server was not able to get the locale environment, while trying to connect to the rpc.fed server. An error message is output to syslog.

Solution:

Investigate whether the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

742337 :Node %d is in the %s for resource %s, but property %s identifies resource %s which cannot host an address on node %d.

Description:

All IP addresses used by this resource must be configured to be available on all nodes that the scalable resource can run on.

Solution:

Either change the resource group nodelist to exclude the nodes that cannot host the SharedAddress IP address, or select a different network resource whose IP address will be available on all nodes where this scalable resource can run.

743362 :could not read failfast mode, using panic

Description:

/opt/SUNWudlm/etc/udlm.conf did not have an entry for failfast mode. Default mode of 'panic' will be used.

Solution:

None.

744837 :No executable $BV1TO1/bin/bvconf.

Description:

The specified executable was not found.

Solution:

Verify that Sun Cluster HA for BroadVision One-To-One Enterprise was properly installed. Ensure that the specified executable is in the correct location.

744866 :Failed to check status of SUNW.HAStoragePlus resource.

Description:

An error occurred while checking the status of the SUNW.HAStoragePlus resource that this resource depends on.

Solution:

Check syslog messages and correct the problems specified in prior syslog messages. If the error persists, report this problem to your Sun service provider.

745100 :CMM: Quorum device %ld has been changed from %s to %s.

Description:

The name of the specified quorum device has been changed as indicated. This can happen if, while this node was down, the previous quorum device was removed from the cluster and a new one was added to the cluster and assigned the same ID as the old one.

Solution:

This is an informational message, no user action is needed.

747567 :Unable to complete any share commands.

Description:

None of the paths specified in the dfstab. file were shared successfully.

Solution:

The prenet_start method would fail. Sun Cluster resource management would attempt to bring the resource on-line on some other node. Manually check that the paths specified in the dfstab. file are correct.

749409 :clcomm: validate_policy: high not enough. high %d low %d in c %d nodes %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. For a variable size resource pool, the high server thread level must be large enough to allow all of the nodes identified in the message to join the cluster and receive a minimal number of server threads.

Solution:

No user action required.

749958 :CMM: Unable to create %s thread.

Description:

The CMM was unable to create the specified thread and the system cannot continue. This is caused by inadequate memory on the system.

Solution:

Add more memory to the system. If that does not resolve the problem, contact your authorized Sun service provider to determine whether a workaround or patch is available.

751258 :TCPTR: Node %u attempting to join cluster has incompatible cluster software. "%s" on node %u not compatible with "%s" on node %u.

Description:

The transport at the local node received an initial handshake message from the remote node indicating that the remote node is not running a compatible version of the cluster software.

Solution:

Make sure all nodes in the cluster are running compatible versions of Sun Cluster software.

751715 :Global service %s associated with path %s is found to be in the maintenence state.

Description:

A global device service is detected to be in the maintenance state. This is treated as an error.

Solution:

Inspect the syslog for errors. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

751934 :scswitch: rgm_change_mastery() failed with NOREF, UNKNOWN, or invalid error on node %d

Description:

An inter-node communication failed with an unknown exception while the rgmd was attempting to execute an operator-requested switch of the primaries of a resource group, or was attempting to "fail back" a resource group onto a node that just rejoined the cluster. This will cause the attempted switching action to fail.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. If the switch was operator-requested, retry it. If the same error recurs, you might have to reboot the affected node. Since this problem might indicate an internal logic error in the clustering software, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.

753155 :Starting fault monitor. pmf tag %s.

Description:

The fault monitor is being started under control of the Process Monitor Facility (PMF), with the tag indicated in the message.

Solution:

No user action required.

753429 :This node has a lower preference compared to node %d for global service %s associated with path %s. Device switchover can still be done to this node.

Description:

This is an informational message.

Solution:

This is an informational message, no user action is needed.

753898 :resource group %s state on node %s change to RG_OFF_PENDING_BOOT

Description:

This is a notification from the rgmd that BOOT methods are running on resources in the given resource group, on a node that has recently joined the cluster. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

754521 :Property %s does not have a value. This property must have exactly one value.

Description:

The property named does not have a value specified for it.

Solution:

Set the property to have exactly one value.

754848 :The property %s must contain at least one SharedAddress network resource.

Description:

The named property must contain at least one SharedAddress.

Solution:

Specify a SharedAddress resource for this property.

756369 :NAFO group %s has failed, so scalable resource %s in resource group %s may not be able to respond to client requests. A request will be issued to relocate resource %s off of this node.

Description:

The named NAFO group has failed, so the node may not be able to respond to client requests. It would be desirable to move the resource to another node that has functioning NAFO groups. A request will be issued on behalf of this resource to relocate the resource to another node.

Solution:

Check the status of the NAFO group on the node. Try to fix the adapters in the NAFO group.

756650 :Failed to set the global interface node to %d for IP %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

757581 :Failed to stop daemon %s.

Description:

The HA-NFS implementation was unable to stop the specified daemon.

Solution:

The resource could be in a STOP_FAILED state. If the Failover_mode is set to HARD, the node would be rebooted automatically by Sun Cluster resource management. If the Failover_mode is set to SOFT or NONE, check that the specified daemon is indeed stopped (killing it by hand, if necessary). Then clear the STOP_FAILED status on the resource and bring it on-line again using the scswitch command.

757758 :scvxvmlg error - getminor called with a bad filename: %s

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.

760001 :(%s) netconf error: cannot get transport info for 'ticlts' %s

Description:

Call to getnetconfigent failed and udlmctl could not get network information. udlmctl will exit.

Solution:

Make sure the interconnect does not have any problems. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

761076 :dl_bind: DL_ERROR_ACK protocol error

Description:

Could not bind to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.

762902 :Failed to restart fault monitor.

Description:

The resource property that was updated needed the fault monitor to be restarted in order for the change to take effect, but the attempt to restart the fault monitor failed.

Solution:

Look at the prior syslog messages for specific problems. Correct the errors if possible. Look for the process _probe operating on the desired resource (indicated by the argument to "-R" option). This can be found from the command: ps -ef | egrep _probe | grep "\-R " Send a kill signal to this process. If the process does not get killed and restarted by the process monitor facility, reboot the node.
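
The process lookup described above can also be done by filtering saved `ps -ef` output. The sketch below is an illustration only: the function name is hypothetical, and it assumes that the process ID is the second column of each `ps -ef` line and that the resource name follows "-R" as a separate argument.

```python
def find_probe_pids(ps_output, resource):
    """Return PIDs of _probe processes started with '-R <resource>'."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header line and anything without a numeric PID column.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        has_probe = any("_probe" in field for field in fields)
        has_resource = any(
            flag == "-R" and value == resource
            for flag, value in zip(fields, fields[1:])
        )
        if has_probe and has_resource:
            pids.append(int(fields[1]))
    return pids
```

The returned process IDs are the candidates for the kill signal mentioned in the solution.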

763929 :HA: rm_service_thread_create failed

Description:

The system could not create the needed thread, because there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage.

765395 :clcomm: RT class not configured in this system

Description:

Sun Cluster requires that the real time thread scheduling class be configured in the kernel.

Solution:

Configure Solaris with the RT thread scheduling class in the kernel.

763781 :For global service <%s> of path <%s>, local node is less preferred than node <%d>. But affinity switch over may still be done.

Description:

A service is switched to a less preferred node due to the affinity switchover of the SUNW.HAStorage prenet_start method.

Solution:

Check which configuration gives the greater performance benefit: leaving the service on its most preferred node, or letting the affinity switchover take effect. Use scswitch(1M) to switch it back if necessary.

766093 :IP address (hostname) and Port pairs %s%c%d%c%s and %s%c%d%c%s in property %s, at entries %d and %d, effectively duplicate each other. The port numbers are the same and the resolved IP addresses are the same.

Description:

The two list entries at the named locations in the named property have port numbers that are identical, and also have IP address (hostname) strings that resolve to the same underlying IP address. An IP address (hostname) string and port entry should only appear once in the property.

Solution:

Specify the property with only one occurrence of the IP address (hostname) string and port entry.
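
The duplicate check described above can be sketched as follows. This is a simplified illustration, not the data service's validation code: the function name is hypothetical, entries are modeled as (host, port) pairs without the protocol field, and the resolver is passed in as a parameter (defaulting to `socket.gethostbyname`) so the check can be tried without a name service.

```python
import socket


def find_effective_duplicates(entries, resolve=socket.gethostbyname):
    """Return (first_index, repeat_index) pairs of duplicate entries.

    entries is a list of (host, port) tuples. Two entries duplicate each
    other when the ports match and the hosts resolve to the same address.
    """
    first_seen = {}
    duplicates = []
    for index, (host, port) in enumerate(entries):
        key = (resolve(host), port)
        if key in first_seen:
            duplicates.append((first_seen[key], index))
        else:
            first_seen[key] = index
    return duplicates
```

Resolving before comparing is what catches the case in this message, where one entry uses a hostname and another uses the IP address it resolves to.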

766316 :Started saposcol process under PMF successfully.

Description:

The OS collector process successfully started under the control of the Process Monitor Facility (PMF).

Solution:

No user action required.

767363 :CMM: Disconnected from node %ld; aborting using %s rule.

Description:

Due to a connection failure between the local and the specified node, the local node must be halted to avoid a "split brain" configuration. The CMM used the specified rule to decide which node to fail. Rules are:

rebootee: If one node is rebooting and the other was a member of the cluster, the node that is rebooting must abort.

quorum: The node with greater control of quorum device votes survives and the other node aborts.

node number: The node with higher node number aborts.

Solution:

The cause of the failure should be resolved and the node should be rebooted if node failure is unexpected.
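
The three rules in the description can be pictured with the sketch below. It is not the CMM's implementation: the NodeView type, its fields, and the order in which the rules are tried (the order in which they are listed above) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeView:
    """Hypothetical summary of one node's state at disconnect time."""
    node_id: int
    rebooting: bool      # node is still in the process of rebooting
    was_member: bool     # node was already a member of the cluster
    quorum_votes: int    # quorum device votes the node controls


def node_to_abort(a, b):
    """Return (node_id, rule) for the node that aborts."""
    # rebootee: a rebooting node yields to an established member.
    if a.rebooting and not b.rebooting and b.was_member:
        return (a.node_id, "rebootee")
    if b.rebooting and not a.rebooting and a.was_member:
        return (b.node_id, "rebootee")
    # quorum: the node controlling fewer quorum device votes aborts.
    if a.quorum_votes != b.quorum_votes:
        loser = a if a.quorum_votes < b.quorum_votes else b
        return (loser.node_id, "quorum")
    # node number: the node with the higher node number aborts.
    return (max(a.node_id, b.node_id), "node number")
```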

767488 :reservation fatal error(UNKNOWN) - Command not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

767629 :lkcm_reg: Unix DLM version (%d) and the OSD library version (%d) are not compatible. Unix DLM versions acceptable to this library are: %d

Description:

Unix DLM and Oracle DLM are not compatible. Compatible versions will be printed as part of this message.

Solution:

Check the installation procedure to make sure you have the correct versions of Oracle DLM and Unix DLM. Contact your Sun service representative if the versions cannot be resolved.

767858 :in libsecurity unknown security type %d

Description:

This is an internal error which shouldn't occur. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

770355 :fatal: received signal %d

Description:

The daemon indicated in the message tag (rgmd or ucmmd) has received a SIGTERM signal, possibly caused by an operator-initiated kill(1) command. The daemon will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

The operator must use scswitch(1M) and shutdown(1M) to take down a node, rather than directly killing the daemon.

770675 :monitor_check: fe_method_full_name() failed for resource <%s>, resource group <%s>

Description:

During execution of a scha_control(1HA,3HA) function, the rgmd was unable to assemble the full method pathname for the MONITOR_CHECK method. This is considered a MONITOR_CHECK method failure. This in turn will prevent the attempted failover of the resource group from its current master to a new master.

Solution:

770776 :INTERNAL ERROR: process_resource: Resource <%s> is R_BOOTING in PENDING_ONLINE resource group

Description:

The rgmd is attempting to bring a resource group online on a node where BOOT methods are still being run on its resources. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

771340 :fatal: Resource group <%s> update failed with error <%d>; aborting node

Description:

The rgmd failed to read the updated resource group from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

772294 :%s requests reconfiguration in step %s

Description:

Return status at the end of a step execution indicates that a reconfiguration is required.

Solution:

None.

772395 :shutdown immediate did not succeed. (%s)

Description:

Failed to shut down the Oracle server using the 'shutdown immediate' command.

Solution:

Examine the 'Stop_timeout' property of the resource. Increase 'Stop_timeout' if the Oracle server takes a long time to shut down and you don't wish to use 'shutdown abort' for stopping the Oracle server.

773078 :Error in configuration file lookup (%s, ...): %s

Description:

Could not read configuration file udlm.conf.

Solution:

Make sure udlm.conf exists under /opt/SUNWudlm/etc and has the correct permissions.

773366 :thread create for hb_threadpool failed

Description:

The system was unable to create the thread used for heartbeat processing.

Solution:

774752 :reservation error(%s) - do_scsi3_inresv() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

For the user action required by this message, see the user action for message 192619.

776199 :(%s) reconfigure: cm error %s

Description:

ucmm reconfiguration failed.

Solution:

776339 :INTERNAL ERROR: postpone_stop_r: meth type <%d>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.

778629 :ERROR: MONITOR_STOP method is not registered for ONLINE resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

779073 :in fe_set_env_vars malloc of env_name[%d] failed

Description:

The rgmd server was not able to allocate memory for an environment variable, while trying to connect to the rpc.fed server, possibly due to low memory. An error message is output to syslog.

Solution:

Investigate whether the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

779089 :Could not start up DCS client because we could not contact the name server.

Description:

There was a fatal error while this node was booting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

780283 :clcomm: Exception in coalescing region - Lost data

Description:

While supporting an invocation, the system wanted to combine buffers and failed. The system identifies the exception prior to this message.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

780539 :Stopping fault monitor: %s:%ld:%s

Description:

The fault monitor has detected an error and will be stopped. The error detected by the fault monitor and the action taken by the fault monitor are indicated in the message.

Solution:

None

780792 :Failed to retrieve the resource type information.

Description:

A Sun Cluster data service has failed to retrieve the resource type's property information. Low memory or an API call failure might be the reason.

Solution:

781731 :Failed to retrieve the cluster handle: %s.

Description:

An API operation has failed while retrieving the cluster information.

Solution:

This may be solved by rebooting the node. For more details about API failure, check the messages from other components.

782111 :This list element in System property %s is missing a protocol: %s.

Description:

The system property that was named does not have a valid format. The value of the property must include a protocol.

Solution:

Add a protocol to the property value.

782694 :The value returned for property %s for resource %s was invalid.

Description:

An unexpected value was returned for the named property.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

783130 :Failed to retrieve the node id for node %s: %s.

Description:

API operation has failed while retrieving the node id for the given node.

Solution:

Check whether the node name is valid. For more information about API call failure, check the messages from other components.

783581 :scvxvmlg fatal error - clconf_lib_init failed, returned %d

Description:

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.

784311 :Network_resources_used property not set properly.

Description:

There might be more than one logical IP address in this resource group, and the Network_resources_used property is not properly set to associate the resources with the appropriate backend hosts.

Solution:

Set the Network_resources_used property for each resource in the resource group to the logical IP address in the resource group that is configured to run Sun Cluster HA for BroadVision One-To-One Enterprise backend processes.

784560 :resource %s status on node %s change to %s

Description:

This is a notification from the rgmd that a resource's fault monitor status has changed.

Solution:

This is an informational message, no user action is needed.

784607 :Couldn't fork1.

Description:

The fork1(2) system call failed.

Solution:

Some system resource has been exceeded. Install more memory, increase swap space or reduce peak memory consumption.

785101 :transition '%s' failed for cluster '%s': unknown code %d

Description:

The mentioned state transition failed for the cluster because of an unexpected command line option. udlmctl will exit.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

785154 :Could not look up IP because IP was NULL.

Description:

The mapping for the given IP address in the local host files can't be done: the specified IP address is NULL.

Solution:

Check whether the IP address has a NULL value. If this is the case, recreate the resource with a valid host name. If this is not the reason, treat it as an internal error and contact your Sun service provider.

785213 :reservation error(%s) - IOCDID_ISFIBRE failed for device %s,errno %d.

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

786412 :reservation fatal error(UNKNOWN) - clconf_lib_init() error, returned %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

786765 :Failed to get host names from resource %s.

Description:

The networking information for the resource could not be retrieved.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

787063 :Error in getting parameters for global service <%s> of path <%s>: %s

Description:

Cannot get information about the global service.

Solution:

Save a copy of /var/adm/messages and contact your authorized Sun service provider to determine the cause of the problem.

787455 :Either extension property <stop_signal> is not defined, or an error occured while retrieving this property; using the default value of SIGTERM.

Description:

Property stop_signal may not be defined in RTR file. Continue the process with the default value of SIGTERM.

Solution:

This is an informational message, no user action is needed.

787529 :NAFO group %s has status %s. The status of the NAFO group will be checked again in %d seconds.

Description:

The NAFO group named is in a transition state. The status will be checked again.

Solution:

This is an informational message, no user action is needed.

788145 :gethostbyname() failed: %s.

Description:

gethostbyname() failed with unexpected error.

Solution:

Check whether the name service is configured correctly. Try some commands to query name servers, such as ping and nslookup, and correct the problem. If the error persists, reboot the node.

789223 :lkcm_sync: caller is not registered

Description:

udlm is not registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

789460 :monitor_check: call to rpc.fed failed for resource <%s>, resource group <%s>, method <%s>

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed, due to a failure of the rgmd to communicate with the rpc.fed daemon. If the rpc.fed process died, this might lead to a subsequent reboot of the node. Otherwise, this will prevent a resource group on the local node from failing over to an alternate primary node.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.

791495 :Unregistered syscall (%d)

Description:

An internal error has occurred. This should not happen. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

791577 :Waiting for the host $i to startup.

Description:

Waiting for the specified host to start.

Solution:

Bring the resource group containing the specified host online, if it is not running. If the resource group is online, no user action is required because the Sun Cluster HA for BroadVision One-To-One Enterprise Probe should take appropriate action.

792109 :Unable to set number of file descriptors.

Description:

rpc.pmfd was unable to set the number of file descriptors used in the RPC server.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

792338 :The property %s must contain at least one value.

Description:

The named property does not have a legal value.

Solution:

Assign the property a value.

792967 :Unable to parse configuration file.

Description:

While parsing the Netscape configuration file, an error occurred either in reading the file or in reading one of the fields within the file.

Solution:

Make sure that the appropriate configuration file is located in its default location with respect to the Confdir_list property.

793575 :Adaptive server terminated.

Description:

Graceful shutdown did not succeed. Adaptive server processes were killed in STOP method.

Solution:

Check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.

794220 :switchback: bad nodename <%s>

Description:

The rgmd encountered a bad node name in the Nodelist of a resource group it was trying to failback. This might indicate corruption of CCR data or rgmd in-memory state.

Solution:

Use scrgadm(1M) -pvv to examine resource group properties. If the values appear corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Please contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

794535 :clcomm: Marshal Type mismatch. Expecting type %d got type %d

Description:

When MARSHAL_DEBUG is enabled, the system tags every data item marshalled to support an invocation. This reports that the current data item in the received message does not have the expected type. The received message format is wrong.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

795047 :Stop fault monitor using pmfadm failed. tag %s error=%d

Description:

The attempt to stop the fault monitor using the Process Monitoring Facility (PMF), with the tag indicated in the message, failed. The error returned by PMF is indicated in the message.

Solution:

Stop fault monitor processes. Please report this problem.

795062 :Stop fault monitor using pmfadm failed. tag %s error=%s

Description:

The attempt to stop the fault monitor using the Process Monitoring Facility (PMF), with the tag indicated in the message, failed. The error returned by PMF is indicated in the message.

Solution:

Stop fault monitor processes. Please report this problem.

795381 :t_open: %s

Description:

Call to t_open() failed. The "t_open" man page describes possible error codes. udlm exits and the node will abort and panic.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.
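Collecting the files named in this solution can be sketched in Python. This is a minimal sketch under stated assumptions: the function name `collect_diagnostics`, the `root` parameter, and the use of a gzip-compressed tar archive are choices made for this example and are not prescribed by Sun Cluster. The sketch runs on one node at a time, so it would have to be repeated on every node.

```python
import glob
import os
import tarfile

# Paths from the solution text, written relative to the filesystem root.
PATTERNS = [
    "var/adm/messages",
    "var/cluster/ucmm/ucmm_reconf.log",
    "var/cluster/ucmm/dlm*/*logs/*",
]


def collect_diagnostics(archive_path, root="/"):
    """Archive regular files matching PATTERNS under root; return names added."""
    added = []
    with tarfile.open(archive_path, "w:gz") as archive:
        for pattern in PATTERNS:
            for path in sorted(glob.glob(os.path.join(root, pattern))):
                if os.path.isfile(path):
                    name = os.path.relpath(path, root)
                    archive.add(path, arcname=name)
                    added.append(name)
    return added
```

The `root` parameter exists only so that the sketch can be tried against a scratch directory before it is pointed at a live node.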

796536 :Password file %s is not readable: %s

Description:

For the secure server to run, a password file named keypass is required. This file could not be read, which resulted in an error when trying to start the Data Service.

Solution:

Create the keypass file and place it under the Confdir_list path for this resource. Make sure that the file is readable.
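A quick way to confirm that the keypass file exists and is readable is sketched below in Python. This is a sketch under stated assumptions: the function name `check_keypass` is hypothetical, the directory argument stands for one entry of the Confdir_list property, and `os.access` tests readability for the user who runs the sketch, which may not be the user under which the data service starts.

```python
import os


def check_keypass(confdir):
    """Return 'missing', 'unreadable', or 'ok' for the keypass file in confdir."""
    path = os.path.join(confdir, "keypass")
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "unreadable"
    return "ok"
```

Any result other than 'ok' points to the condition this message reports.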

796771 :check_for_ccrdata failed malloc of size %d

Description:

Call to malloc failed. The "malloc" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.

797604 :CMM: Connectivity of quorum device %ld (%s) has been changed from 0x%llx to 0x%llx.

Description:

The number of configured paths to the specified quorum device has been changed as indicated. The connectivity information is depicted as bitmasks.

Solution:

This is an informational message; no user action is needed.
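The two bitmasks in message 797604 can be compared with a short Python sketch. This is an illustration under stated assumptions: the function names are hypothetical, and this guide does not say which node or path each bit position represents, so the sketch reports raw zero-based bit positions only.

```python
def set_bits(mask):
    """Return the zero-based positions of the bits that are set in mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def connectivity_change(old_mask, new_mask):
    """Return (gained, lost) bit positions between two connectivity masks."""
    gained = set_bits(new_mask & ~old_mask)
    lost = set_bits(old_mask & ~new_mask)
    return gained, lost
```

For example, `connectivity_change(0x3, 0x1)` returns `([], [1])`: no bits were gained, and bit 1 was cleared.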

798060 :Error opening procfs status file <%s> for tag <%s>: %s

Description:

The rpc.pmfd server was not able to open a procfs status file, and the system error is shown. procfs status files are required in order to monitor user processes.

Solution:

Investigate whether the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

798175 :sema_wait: %s

Description:

The rpc.pmfd server was not able to act on a semaphore. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

798514 :Starting fault monitor. pmf tag %s

Description:

Informational message. The fault monitor is being started under the control of the Process Monitoring Facility (PMF), with the tag indicated in the message.

Solution:

None

798658 :Failed to get the resource type name: %s.

Description:

An API operation failed to retrieve the resource type name while retrieving the resource information.

Solution:

This is an internal error. Contact your authorized Sun service provider. For a more detailed error description, check the syslog messages.

799348 :INTERNAL ERROR: MONITOR_START method is not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

799426 :clcomm: can't ifconfig private interface: %s:%d cmd %d error %d

Description:

The system failed to configure a private network device for IP communications across the private interconnect, using the device and IP address indicated. The attempt resulted in the error identified in the message.

Solution:

Ensure that the network interconnect device is supported. Otherwise, contact your authorized Sun service provider to determine whether a workaround or patch is available.