Sun Cluster 3.1 Error Messages Guide

Message IDs 700000–799999


700161 Fault monitor is already running.

Description:

The resource's fault monitor is already running.

Solution:

This is an internal error. Save the /var/adm/messages file from all the nodes. Contact your authorized Sun service provider.


700321 exec() of %s failed: %m.

Description:

The exec() system call failed for the given reason.

Solution:

Verify that the pathname given is valid.


701136 Failed to stop monitor server.

Description:

Sun Cluster HA for Sybase failed to stop the monitor server using the KILL signal.

Solution:

Check whether any Sybase server processes are running on the server, and shut down the server manually.


701567 Unable to bind door %s: %s

Description:

Solution:


702148 internal error


703156 scha_control GIVEOVER failed with error code: %s

Description:

The fault monitor detected problems in the Oracle listener. The attempt to switch the resource over to another node failed. The error returned by the scha_control API call is indicated in the message.

Solution:

Check the Oracle listener setup. Make sure that the Listener_name specified in the resource property is configured in the listener.ora file. Check the 'Host' property of the listener in the listener.ora file. Examine the log file and syslog messages for additional information.


703450 Despite the warnings, the validation of the hostname list succeeded


703476 clcomm: unable to create desired unref threads

Description:

The system was unable to create threads that deal with no longer needed objects. The system fails to create threads when memory is not available. This message can be generated by the inability of either the kernel or a user level process. The kernel creates unref threads when the cluster starts. A user level process creates threads when it initializes.

Solution:

Take steps to increase memory availability. The installation of more memory will avoid the problem with a kernel inability to create threads. For a user level process problem: install more memory, increase swap space, or reduce the peak work load.


703553 Resource group name or resource name is too long.

Description:

The process monitor facility failed to execute the command. The resource group name or resource name is too long for the process monitor facility command.

Solution:

Check the resource group name and resource name. Give the resource group or resource a shorter name.


703744 reservation fatal error(%s) - get_cluster_state() exception

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


704795 in libsecurity could not negotiate uid on any transport in NETPATH

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish an RPC connection for the network specified: it could not find any transport. This happened because either there are no available transports at all, or there are transports but none is a loopback. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


705163 load balancer thread failed to start for %s

Description:

The system has run out of the resources that are required to create a thread. The system could not create the load balancer thread.

Solution:

The service group is created with the default load balancing policy. If rebalancing is required, free up resources by shutting down some processes. Then delete the service group and re-create it.


705629 clutil: Can't allocate hash table

Description:

The system attempted unsuccessfully to allocate a hash table. There was insufficient memory.

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.


705693 listen: %s

Description:

Solution:


706159 Failed to switchover resource group %s: %s

Description:

An attempt to switch over the specified resource group failed. The reason for the failure is logged.

Solution:

Look for the message indicating the reason for this failure. This should help in the diagnosis of the problem.


706314 clexecd: Error %d from open(/dev/zero). Exiting.

Description:

clexecd program has encountered a failed open(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


707421 %s: Cannot create a thread.

Description:

Solaris has reached its limit on threads. Either too many clients are requesting a service, causing many threads to be created at once, or the system is overloaded with processes.

Solution:

Reduce system load by reducing the number of requestors of this service or by halting other processes on the system.


707881 clcomm: thread_create failed for autom_thread

Description:

The system could not create the needed thread, because there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage. Since this happens during system startup, application memory usage is normally not a factor.


707948 launching method <%s> for resource <%s>, resource group <%s>, timeout <%d> seconds

Description:

RGM has invoked a callback method for the named resource, as a result of a cluster reconfiguration, scha_control GIVEOVER, or scswitch.

Solution:

This is an informational message, no user action is needed.


708422 Command {%s} failed: %s.

Description:

The command noted did not return the expected value. Additional information may be found in the error message after the ':', or in subsequent messages in syslog.

Solution:

This message is issued from a general purpose routine. Appropriate action may be indicated by the additional information in the message or in syslog.


708825 Failed to validate IPMP group name <%s> pnm errorcode <%d>.


709082 "pmfadm -k": Can not signal <%s>: Monitoring is not resumed on pid %d

Description:

The command 'pmfadm -k' can not be executed on the given tag because the monitoring is suspended on the indicated pid.

Solution:

Resume the monitoring on the indicated pid with the 'pmfctl -R' command.


710143 Failed to add node %d to scalable service group %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


711956 open /dev/ip failed: %s.

Description:

System was attempting to open the specified device, but was unable to do so.

Solution:

This might be the result of a lack of system resources. Check whether the system is low on memory and take appropriate action. For specific error information, check the syslog message.


712367 clcomm: Endpoint %p: deferred task not allowed in state %d

Description:

The system maintains information about the state of an Endpoint. A deferred task is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


712591 Validation failed. Resource group property FAILBACK must be FALSE

Description:

The resource being created or modified must belong to a group that has a value of FALSE for its FAILBACK property.

Solution:

Specify FALSE for the FAILBACK property.


713428 Confdir_list must be an absolute path.

Description:

Each entry in Confdir_list must be an absolute path (start with '/').

Solution:

Create the resource with absolute paths in Confdir_list.
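
As a rough illustration of this rule, the sketch below flags entries that do not start with '/'. The function name, the comma-separated input format, and the sample paths are assumptions made for the example; they are not taken from the Sun Cluster validate method.

```python
def relative_confdir_entries(confdir_list):
    """Return the entries of a comma-separated list that are not absolute paths."""
    # Split on commas and drop surrounding whitespace and empty entries.
    entries = [e.strip() for e in confdir_list.split(",") if e.strip()]
    # An absolute path starts with '/'.
    return [e for e in entries if not e.startswith("/")]


# Hypothetical value: the second entry is relative and would be rejected.
print(relative_confdir_entries("/global/web/conf, conf/site2"))
```

An empty result means every entry passed the absolute-path check.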


714123 Stopping the backup server.

Description:

The backup server is about to be brought down by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


714173 Load balancer setting distribution.


714208 Starting liveCache timed out with command %s.

Description:

Starting liveCache timed out.

Solution:

Look for syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


714838 reservation fatal error(%s) - Unable to open name file '%s', errno %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


715958 Method <%s> on resource <%s> stopped due to receipt of signal <%d>

Description:

A resource method was stopped by a signal, most likely resulting from an operator-issued kill(1). The method is considered to have failed.

Solution:

The operator must kill the stopped method. The operator may then choose to issue an scswitch(1M) command to bring resource groups onto desired primaries, or re-try the administrative action that was interrupted by the method failure.


716023 BV1TO1 variable not set.

Description:

The BV1TO1 variable is not configured in bv1to1.conf file.

Solution:

Reconfigure the Broadvision site with a proper BV1TO1 value.


716253 launch_fed_prog: fe_set_env_vars() failed for program <%s>, step <%s>

Description:

The ucmmd server was not able to get the locale environment. An error message is output to syslog.

Solution:

Investigate whether the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


718325 Failed to stop development system within %d seconds. Will continue to stop the development system in the background. Meanwhile, the production system Central Instance is started up now.

Description:

Failed to shut down the development system within the timeout period. The shutdown will continue in the background. Meanwhile, the Central Instance will be started.

Solution:

No action needed. You might consider increasing the Dev_stop_pct property or Start_timeout property.


718457 Dispatcher Process is not running. pid was %d

Description:

The main dispatcher process is not present in the process list, indicating that the main dispatcher is not running on this node.

Solution:

No action needed. Fault monitor will detect that the main dispatcher process is not running, and take appropriate action.


719114 Failed to parse key/value pair from command line for %s.

Description:

The validate method for the scalable resource network configuration code was unable to convert the property information given to a usable format.

Solution:

Verify the property information was properly set when configuring the resource.


719497 clcomm: path_manager using RT lwp rather than clock interrupt

Description:

The system has been built to use a real time thread to support path_manager heart beats instead of the clock interrupt.

Solution:

No user action is required.


719682 fopen: %s

Description:

Solution:


719997 Failed to pre-allocate swap space

Description:

The pmfd, fed, or other program was not able to allocate swap space. This means that the machine is low in swap space. The server does not come up, and an error message is output to syslog.

Solution:

Investigate if the machine is running out of swap. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


721252 cm2udlm: cm_getclustmbyname: %s

Description:

Could not create a structure for communication with the cluster monitor process.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


721263 Extension property <stop_signal> has a value of <%d>

Description:

Resource property stop_signal is set to a value or has a default value.

Solution:

This is an informational message, no user action is needed.


721650 Siebel server not running.

Description:

Siebel server may not be running.

Solution:

This is an informational message. The fault monitor should either restart or fail over the Siebel server resource. This message may also be generated during the start method while waiting for the service to come up.


721881 dl_attach: kstr_msg failed %d error

Description:

Could not attach to the private interconnect.

Solution:

Rebooting the node might fix the problem.


722270 fatal: cannot create state machine thread

Description:

The rgmd was unable to create a thread upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


722439 Restarting using scha_control RESOURCE_RESTART

Description:

The fault monitor has detected problems in the RDBMS server. An attempt will be made to restart the RDBMS server on the same node.

Solution:

Check the cause of RDBMS failure.


722768 %s: could not get network addresses.

Description:

The daemon is unable to get the network addresses of itself and the caller.


722904 Failed to open the resource group handle: %s.

Description:

An API operation has failed while retrieving the resource group property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components. For the resource group name and the property name, check the current syslog message.


722984 call to rpc.fed failed for resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd failed in an attempt to execute a method, due to a failure to communicate with the rpc.fed daemon. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state. If the rpc.fed process died, this might lead to a subsequent reboot of the node.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


723206 SAP is already running.

Description:

SAP is already running either locally on this node or remotely on a different node in the cluster outside of the control of the Sun Cluster.

Solution:

Shut down SAP first, before starting SAP under the control of Sun Cluster.


724035 Failed to connect to %s secure port %d.

Description:

An error occurred while the fault monitor was trying to connect to a secure port specified in the Port_list property for this resource.

Solution:

Check to make sure that the Port_list property is correctly set to the same port number that the Netscape Directory Server is running on.


726004 Invalid timeout value %d passed.

Description:

Failed to execute the command under the specified timeout. The specified timeout is invalid.

Solution:

Respecify a positive, non-zero timeout value.
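
The rule stated in the solution can be sketched as a small check. This is only an illustration under stated assumptions: the function name is hypothetical and the check is not the code that emits message 726004.

```python
def validate_timeout(timeout):
    """Return the timeout in seconds, or raise ValueError if it is not a positive integer."""
    # Reject booleans explicitly, because bool is a subclass of int in Python.
    if isinstance(timeout, bool) or not isinstance(timeout, int) or timeout <= 0:
        raise ValueError(f"Invalid timeout value {timeout} passed.")
    return timeout


# A positive, non-zero value is returned unchanged.
print(validate_timeout(300))
```

Zero and negative values both raise ValueError in this sketch.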


726417 read %d for %sport

Description:

Could not get the port information from config file udlm.conf.

Solution:

Check to make sure that the udlm.conf file exists and has an entry for udlm.port. If everything looks normal and the problem persists, contact your Sun service representative.


727065 CMM: Enabling failfast on quorum device %s failed with error %d.

Description:

An attempt to enable failfast on the specified quorum device failed with the specified error.

Solution:

Check if the specified quorum disk has failed. This message may also be logged when a node is booting up and has been preempted from the cluster, in which case no user action is necessary.


727160 msg of wrong version %d, expected %d

Description:

udlmctl received an illegal message.

Solution:

None. udlm will handle this error.


728216 reservation error(%s) - did_get_path() error

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


728425 INTERNAL ERROR: bad state <%s> (%d) for resource group <%s> in rebalance()

Description:

An internal error has occurred in the rgmd. This may prevent the rgmd from bringing the affected resource group online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


728881 Failed to read data: %s.

Description:

Failed to read the data from the socket. The reason might be expiration of timeout, hung application or heavy load.

Solution:

Check if the application is hung. If this is the case, restart the application.


728928 CCR: Can't access table %s on node %s errno = %d.

Description:

The indicated error occurred when CCR tried to access the indicated table on the nodes in the cluster. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28 (ENOSPC) indicates that the root file system on the node is full. Other values of errno can be returned when the root disk has failed (EIO).

Solution:

There may be other related messages on the node where the failure occurred. They may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced. If the indicated table was accidentally removed, boot the indicated node in -x mode to restore the indicated table from backup. The CCR tables are located at /etc/cluster/ccr/.
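
When reading the errno value from this message, it can help to translate the number into its symbolic name. The sketch below uses Python's standard errno and os modules; the helper name describe_errno is hypothetical. The numeric values come from the platform where the script runs, so treat the output as a convenience and check /usr/include/sys/errno.h on the affected node for the authoritative definitions.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, description) for an errno value on this platform."""
    # errno.errorcode maps numeric values to names such as 'ENOSPC'.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# The two conditions named in the description: full file system and I/O error.
for code in (errno.ENOSPC, errno.EIO):
    name, text = describe_errno(code)
    print(f"errno = {code} ({name}): {text}")
```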


729152 clexecd: Error %d from F_SETFD. Exiting.

Description:

clexecd program has encountered a failed fcntl(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


730190 scvxvmlg error - found non device-node or non link %s, directory not removed

Description:

The program responsible for maintaining the VxVM namespace has detected suspicious entries in the global device namespace.

Solution:

The global device namespace should only contain diskgroup directories and volume device nodes for registered diskgroups. The specified path was not recognized as either of these and should be removed from the global device namespace.


730685 PCSTATUS: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


730782 Failed to update scalable service: Error %d.

Description:

Update to a property related to scalability was not successfully applied to the system.

Solution:

Use scswitch to try to bring the resource offline and online again on this node. If the error persists, reboot the node and contact your Sun service representative.


730956 %d entries found in property %s. For a nonsecure Netscape Directory Server instance %s should have exactly one entry.

Description:

Since a nonsecure Netscape Directory Server instance only listens on a single port, the list property should only have a single entry. A different number of entries was found.

Solution:

Change the number of entries to be exactly one.


731263 %s: run callback had a NULL event

Description:

The run_callback() routine is called only when an IPMP group's state changes from OK to DOWN and also when an IPMP group is updated (adapter added to the group).

Solution:

Save a copy of the /var/adm/messages files on the node. Contact your authorized Sun service provider for assistance in diagnosing the problem.


731616 No memory.


732069 dl_attach: DL_ERROR_ACK protocol error

Description:

Could not attach to the physical device.

Solution:

Check the documentation for the driver associated with the private interconnect. It might be that the message returned is too small to be valid.


732569 reservation error(%s) error. Not found clexecd on node %d.

Description:

The device fencing code was unable to communicate with another cluster node.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


732643 scha_control: warning: cannot store %s restart timestamp for resource group <%s> resource <%s>: time() failed, errno <%d> (%s)

Description:

A time() system call has failed. This prevents updating the history of scha_control restart calls. This could cause the scha_resource_get (NUM_RESOURCE_RESTARTS) or (NUM_RG_RESTARTS) query to return an inaccurate value on this node. This in turn could cause a failing resource to be restarted continually rather than failing over to another node. However, this problem is very unlikely to occur.

Solution:

If this message is produced and it appears that a resource or resource group is continually restarting without failing over, try switching the resource group to another node. Other syslog error messages occurring on the same node might provide further clues to the root cause of the problem.


732787 bv1to1.conf.sh file is not found in the %s/etc directory

Description:

The bv1to1.conf.sh file is not accessible.

Solution:

Check if the file exists in $BV1TO1_VAR/etc/bv1to1.conf.sh. If the file exists in this directory check if the BV1TO1_VAR extension property is correctly set.


732822 clconf: Invalid group name

Description:

An invalid group name has been encountered while converting a group name to clconf_obj type. Valid group names are "cluster", "nodes", "adapters", "ports", "blackboxes", "cables", and "quorum_devices".

Solution:

This is an unrecoverable error, and the cluster needs to be rebooted. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


733367 lkcm_act: %s: %s cm_reconfigure failed

Description:

ucmm reconfiguration failed.

Solution:

None if the next reconfiguration succeeds. If not, save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


734057 clcomm: Duplicate TypeId's: %s : %s

Description:

The system records type identifiers for multiple kinds of type data. The system checks for type identifiers when loading type information. This message identifies two items having the same type identifiers. This checking only occurs on debug systems.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


734173 No hosts are configured in bv1to1.conf.

Description:

There are no hosts configured to run Broadvision processes.

Solution:

Configure the BV hosts properly in the bv1to1.conf file.


734832 clutil: Created insufficient threads in threadpool

Description:

There was insufficient memory to create the desired number of threads.

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.


734890 pthread_detach: %s

Description:

The rpc.pmfd server was not able to detach a thread, possibly due to low memory. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If all looks correct, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


735336 Media error encountered, but Auto_end_bkp is disabled.

Description:

The HA-Oracle start method identified that one or more datafiles is in need of recovery. The Auto_end_bkp extension property is disabled so no further recovery action was taken.

Solution:

Examine the log files for the cause of the media error. If it's caused by datafiles being left in hot backup mode, the Auto_end_bkp extension property should be enabled or the datafiles should be recovered manually.


737104 Received unexpected result <%d> from rpc.fed, aborting node

Description:

This node encountered an unexpected error while communicating with other cluster nodes during a cluster reconfiguration. The ucmmd will produce a core file and will cause the node to halt or reboot.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


737125 INTERNAL ERROR: PENDING_OFF_STOP_FAILED or ERROR_STOP_FAILED in rebalance()

Description:

An internal error has occurred in the rgmd. This may prevent the rgmd from bringing affected resource groups online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


737445 rebalance: not attempting to start resource group <%s> on node <%s> because this resource group has already failed to start on this node %d or more times in the past %d seconds

Description:

The rgmd is preventing "ping-pong" failover of the resource group, i.e., repeated failover of the resource group between two or more nodes.

Solution:

The time interval in seconds that is mentioned in the message can be adjusted by using scrgadm(1M) to set the Pingpong_interval property of the resource group.
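
The condition described in the message text (a given number of start failures within a given number of seconds) can be pictured as a sliding-window count. The following is a minimal sketch under that reading only; the function and parameter names are hypothetical, and it does not reproduce the actual rgmd logic.

```python
def too_many_recent_failures(failure_times, now, interval, max_failures):
    """Return True if at least max_failures start failures fall in the past interval seconds."""
    # Keep only failures that happened within the window ending at 'now'.
    recent = [t for t in failure_times if 0 <= now - t <= interval]
    return len(recent) >= max_failures


# Two failures within the last 3600 seconds: a further start attempt would be skipped.
print(too_many_recent_failures([100, 3500], now=3600, interval=3600, max_failures=2))
```

In this picture, a longer interval keeps old failures in the window for longer, and a shorter interval lets them age out sooner.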


737572 PMF error when starting Sybase %s: %s. Error: %s

Description:

Sun Cluster HA for Sybase failed to start the Sybase server using the Process Monitoring Facility (PMF). Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Check whether the server can be started manually. Examine the HA-Sybase log files, Sybase log files and setup.


738465 Malformed adapter specification %s.

Description:

Failed to retrieve the ipmp information. The given adapter specification is invalid.

Solution:

Check whether the adapter specification is in the form of ipmpgroup@nodename. If not, recreate the resource with the properly formatted adapter information.
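
The ipmpgroup@nodename form can be checked with a short parser such as the sketch below. The function name and the sample group and node names are hypothetical, and the real validation may apply further rules that are not described here.

```python
def parse_adapter_spec(spec):
    """Split 'ipmpgroup@nodename' into (ipmpgroup, nodename) or raise ValueError."""
    group, sep, node = spec.partition("@")
    # Require exactly one '@' with non-empty text on both sides.
    if not sep or not group or not node or "@" in node:
        raise ValueError(f"Malformed adapter specification {spec}.")
    return group, node


# Hypothetical IPMP group and node names, used only for illustration.
print(parse_adapter_spec("sc_ipmp0@phys-node-1"))
```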


738847 clexecd: unable to create failfast object.

Description:

The clexecd program could not enable one of the mechanisms that cause the node to be shut down, to prevent data corruption, when the clexecd program dies.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


739356 warning: cannot store start_failed timestamp for resource group <%s>: time() failed, errno <%d> (%s)

Description:

The specified resource group failed to come online on some node, but this node is unable to record that fact due to the failure of the time(2) system call. The consequence of this is that the resource group may continue to pingpong between nodes for longer than the Pingpong_interval property setting.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. If the same error recurs, you might have to reboot the affected node.


739653 Port number %d is listed twice in property %s, at entries %d and %d.

Description:

The port number in the message was listed twice in the named property, at the list entry locations given in the message. A port number should only appear once in the property.

Solution:

Specify the property with only one occurrence of the port number.
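
To locate the repeated entry before resubmitting the property, a check along the following lines can help. This is a sketch with a hypothetical function name; it reports zero-based list positions, which may not match the entry numbering used in the message.

```python
def find_duplicate_port(ports):
    """Return (port, first_index, second_index) for the first repeated port,
    or None when every port appears only once."""
    seen = {}
    for index, port in enumerate(ports):
        if port in seen:
            return port, seen[port], index
        seen[port] = index
    return None


print(find_duplicate_port([80, 443, 8080, 443]))
```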


740373 Failed to get the scalable service related properties for resource %s.

Description:

An unexpected error occurred while trying to collect the properties related to scalable networking for the named resource.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


740972 in fe_set_env_vars setlocale failed

Description:

The rgmd server was not able to get the locale environment, while trying to connect to the rpc.fed server. An error message is output to syslog.

Solution:

Investigate if the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


742337 Node %d is in the %s for resource %s, but property %s identifies resource %s which cannot host an address on node %d.

Description:

All IP addresses used by this resource must be configured to be available on all nodes that the scalable resource can run on.

Solution:

Either change the resource group nodelist to exclude the nodes that cannot host the SharedAddress IP address, or select a different network resource whose IP address will be available on all nodes where this scalable resource can run.


743362 could not read failfast mode, using panic

Description:

/opt/SUNWudlm/etc/udlm.conf did not have an entry for failfast mode. Default mode of 'panic' will be used.

Solution:

None.


743923 Starting server with command %s.

Description:

Sun Cluster is starting the application with the specified command.

Solution:

This is an informational message, no user action is needed.


744837 No executable $BV1TO1/bin/bvconf

Description:

The specified executable is not found.

Solution:

Check if the BroadVision software was installed properly. Make sure the specified executable is available at the right location.


744866 Failed to check status of SUNW.HAStoragePlus resource.

Description:

An error occurred while checking the status of the SUNW.HAStoragePlus resource that this resource depends on.

Solution:

Check syslog messages and correct the problems specified in prior syslog messages. If the error still persists, please report this problem.


745100 CMM: Quorum device %ld has been changed from %s to %s.

Description:

The name of the specified quorum device has been changed as indicated. This can happen if while this node was down, the previous quorum device was removed from the cluster, and a new one was added (and assigned the same id as the old one) to the cluster.

Solution:

This is an informational message, no user action is needed.


745275 PNM daemon system error: %s

Description:

A system error has occurred in the PNM daemon. This could be because system resources are very low, for example, low memory.

Solution:

If the message is:

out of memory - increase the swap space, install more memory or reduce peak memory consumption. Otherwise the error is unrecoverable, and the node needs to be rebooted.
can't open file - check the "open" man page for possible errors.
fcntl error - check the "fcntl" man page for possible errors.
poll failed - check the "poll" man page for possible errors.
socket failed - check the "socket" man page for possible errors.
SIOCGLIFNUM failed - check the "ioctl" man page for possible errors.
SIOCGLIFCONF failed - check the "ioctl" man page for possible errors.
wrong address family - check the "ioctl" man page for possible errors.
SIOCGLIFFLAGS failed - check the "ioctl" man page for possible errors.
SIOCGLIFADDR failed - check the "ioctl" man page for possible errors.
rename failed - check the "rename" man page for possible errors.
SIOCGLIFGROUPNAME failed - check the "ioctl" man page for possible errors.
setsockopt (SO_REUSEADDR) failed - check the "setsockopt" man page for possible errors.
bind failed - check the "bind" man page for possible errors.
listen failed - check the "listen" man page for possible errors.
read error - check the "read" man page for possible errors.
SIOCSLIFGROUPNAME failed - check the "ioctl" man page for possible errors.
SIOCSLIFFLAGS failed - check the "ioctl" man page for possible errors.
SIOCGLIFNETMASK failed - check the "ioctl" man page for possible errors.
SIOCSLIFADDR failed - check the "ioctl" man page for possible errors.
SIOCLIFREMOVEIF failed - check the "ioctl" man page for possible errors.
SIOCSLIFNETMASK failed - check the "ioctl" man page for possible errors.
write error - check the "write" man page for possible errors.
accept failed - check the "accept" man page for possible errors.
wrong peerlen %d - check the "accept" man page for possible errors.
gethostbyname failed %s - make sure entries in /etc/hosts, /etc/nsswitch.conf and /etc/netconfig are correct to get information about this host.
SIOCGIFARP failed - check the "ioctl" man page for possible errors. Check the arp cache to see if all the adapters in the node have their entries.
can't install SIGTERM handler - check the man page for possible errors.


747567 Unable to complete any share commands.

Description:

None of the paths specified in the dfstab.<resource-name> file were shared successfully.

Solution:

The prenet_start method would fail. Sun Cluster resource management would attempt to bring the resource on-line on some other node. Manually check that the paths specified in the dfstab.<resource-name> file are correct.


748729 clconf: Failed to open table infrastructure in unregister_infr_callback

Description:

Failed to open table infrastructure in unregistered clconf callback with CCR. Table infrastructure not found.

Solution:

Check the table infrastructure.


749409 clcomm: validate_policy: high not enough. high %d low %d in c %d nodes %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. For a variable size resource pool, the high server thread level must be large enough to allow all of the nodes identified in the message to join the cluster and receive a minimal number of server threads.

Solution:

No user action required.


749958 CMM: Unable to create %s thread.

Description:

The CMM was unable to create its specified thread and the system cannot continue. This is caused by inadequate memory on the system.

Solution:

Add more memory to the system. If that does not resolve the problem, contact your authorized Sun service provider to determine whether a workaround or patch is available.


751079 scha_cluster_open failed

Description:

Call to initialize a handle to get cluster information failed. This means that the incoming connection to the PNM daemon will not be accepted.

Solution:

There could be other related error messages which might be helpful. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


751258 TCPTR: Node %u attempting to join cluster has incompatible cluster software. \"%s\" on node %u not compatible with \"%s\" on node %u.

Description:

Transport at the local node received an initial handshake message from the remote node that is not running a compatible version of the Sun Cluster software.

Solution:

Make sure all nodes in the cluster are running compatible versions of Sun Cluster software.


751934 scswitch: rgm_change_mastery() failed with NOREF, UNKNOWN, or invalid error on node %d

Description:

An inter-node communication failed with an unknown exception while the rgmd was attempting to execute an operator-requested switch of the primaries of a resource group, or was attempting to "fail back" a resource group onto a node that just rejoined the cluster. This will cause the attempted switching action to fail.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. If the switch was operator-requested, retry it. If the same error recurs, you might have to reboot the affected node. Since this problem might indicate an internal logic error in the clustering software, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


752212 Failed to retrieve the resource handle: %s


753155 Starting fault monitor. pmf tag %s.

Description:

The fault monitor is being started under control of the Process Monitoring Facility (PMF), with the tag indicated in the message.

Solution:

This is an informational message, no user action is needed.


754283 pipe: %s

Description:

The rpc.fed server was not able to create a pipe. The message contains the system error. The server will not capture the output from methods it runs.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


754521 Property %s does not have a value. This property must have exactly one value.

Description:

The property named does not have a value specified for it.

Solution:

Set the property to have exactly one value.


754848 The property %s must contain at least one SharedAddress network resource.

Description:

The named property must contain at least one SharedAddress.

Solution:

Specify a SharedAddress resource for this property.


756082 clcomm:Cannot fork() after ORB server initialization.

Description:

A user level process attempted to fork after ORB server initialization. This is not allowed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


756650 Failed to set the global interface node to %d for IP %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


757236 Error initializing LDAP library to probe %s port %d for non-secure resource %s: %s

Description:

An error occurred while initializing the LDAP library. The error message will contain the error returned by the library.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


757581 Failed to stop daemon %s.

Description:

The HA-NFS implementation was unable to stop the specified daemon.

Solution:

The resource could be in a STOP_FAILED state. If the Failover_mode is set to HARD, the node would get automatically rebooted by the Sun Cluster resource management. If the Failover_mode is set to SOFT or NONE, please check that the specified daemon is indeed stopped (by killing it by hand, if necessary). Then clear the STOP_FAILED status on the resource and bring it on-line again using the scswitch command.


757758 scvxvmlg error - getminor called with a bad filename: %s

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


760001 (%s) netconf error: cannot get transport info for 'ticlts' %s

Description:

Call to getnetconfigent failed and udlmctl could not get network information. udlmctl will exit.

Solution:

Make sure the interconnect does not have any problems. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


760354 modinstall of cldlpihb failed

Description:

The streams module that intercepts heartbeat messages could not be installed.


761076 dl_bind: DL_ERROR_ACK protocol error

Description:

Could not bind to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


762902 Failed to restart fault monitor.

Description:

The resource property that was updated needed the fault monitor to be restarted in order for the change to take effect, but the attempt to restart the fault monitor failed.

Solution:

Look at the prior syslog messages for specific problems. Correct the errors if possible. Look for the process <dataservice>_probe operating on the desired resource (indicated by the argument to "-R" option). This can be found from the command:

ps -ef | egrep <dataservice>_probe | grep "\-R <resourcename>"

Send a kill signal to this process. If the process does not get killed and restarted by the process monitor facility, reboot the node.


763305 realloc: %d

Description:

Solution:


763570 can't start pnmd due to lock

Description:

An attempt was made to start multiple instances of the PNM daemon pnmd(1M), or pnmd(1M) has a problem acquiring a lock on the file (/var/cluster/run/pnm_lock).

Solution:

Check if another instance of pnmd is already running. If not, remove the lock file (/var/cluster/run/pnm_lock) and start pnmd by sending a KILL (9) signal to pnmd. PMF will restart pnmd automatically.


763781 For global service <%s> of path <%s>, local node is less preferred than node <%d>. But affinity switch over may still be done.

Description:

A service is switched to a less preferred node due to affinity switchover of SUNW.HAStorage prenet_start method.

Solution:

Check which configuration gains more performance benefit: leaving the service on its most preferred node or letting the affinity switchover take effect. Use scswitch(1M) to switch it back if necessary.


763929 HA: rm_service_thread_create failed

Description:

The system could not create the needed thread, because there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage.


765087 uname: %s

Description:

The rpc.fed server encountered an error with the uname function. The message contains the system error.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


765395 clcomm: RT class not configured in this system

Description:

Sun Cluster requires that the real time thread scheduling class be configured in the kernel.

Solution:

Configure Solaris with the RT thread scheduling class in the kernel.


766093 IP address (hostname) and Port pairs %s%c%d%c%s and %s%c%d%c%s in property %s, at entries %d and %d, effectively duplicate each other. The port numbers are the same and the resolved IP addresses are the same.

Description:

The two list entries at the named locations in the named property have port numbers that are identical, and also have IP address (hostname) strings that resolve to the same underlying IP address. An IP address (hostname) string and port entry should only appear once in the property.

Solution:

Specify the property with only one occurrence of the IP address (hostname) string and port entry.
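
The notion of two entries that effectively duplicate each other can be sketched in Python as follows. This is an assumption-laden illustration, not the product's validation code: the function name and the host names are hypothetical, and the resolver is passed in so that the example runs against a fixed table instead of a live name service. It returns zero-based positions.

```python
import socket


def find_effective_duplicate(entries, resolve=socket.gethostbyname):
    """Return the positions (first, second) of the first two (host, port)
    entries whose resolved IP address and port are both identical, or None."""
    seen = {}
    for index, (host, port) in enumerate(entries):
        key = (resolve(host), port)
        if key in seen:
            return seen[key], index
        seen[key] = index
    return None


# Hypothetical name table standing in for /etc/hosts or a name service.
table = {"web-a": "192.0.2.10", "web-alias": "192.0.2.10", "db": "192.0.2.20"}
entries = [("web-a", 80), ("db", 80), ("web-alias", 80)]
print(find_effective_duplicate(entries, resolve=table.__getitem__))
```

Different host name strings can therefore still collide once they are resolved, which is why removing textual duplicates alone may not clear this message.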


766316 Started saposcol process under PMF successfully.

Description:

The SAP OS collector process is started successfully under the control of the Process Monitoring Facility.

Solution:

Informational message. No user action needed.


767363 CMM: Disconnected from node %ld; aborting using %s rule.

Description:

Due to a connection failure between the local and the specified node, the local node must be halted to avoid a "split brain" configuration. The CMM used the specified rule to decide which node to fail. Rules are:

rebootee: If one node is rebooting and the other was a member of the cluster, the node that is rebooting must abort.
quorum: The node with greater control of quorum device votes survives and the other node aborts.
node number: The node with higher node number aborts.

Solution:

The cause of the failure should be resolved and the node should be rebooted if node failure is unexpected.


767488 reservation fatal error(UNKNOWN) - Command not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


767629 lkcm_reg: Unix DLM version (%d) and the OSD library version (%d) are not compatible. Unix DLM versions acceptable to this library are: %d

Description:

Unix DLM and Oracle DLM are not compatible. Compatible versions will be printed as part of this message.

Solution:

Check the installation procedure to make sure you have the correct versions of Oracle DLM and Unix DLM. Contact your Sun service representative if versions cannot be resolved.


767858 in libsecurity unknown security type %d

Description:

This is an internal error which shouldn't occur. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


768676 Failed to access <%s>: <%s>

Description:

Solution:


770355 fatal: received signal %d

Description:

The daemon indicated in the message tag (rgmd or ucmmd) has received a SIGTERM signal, possibly caused by an operator-initiated kill(1) command. The daemon will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

The operator must use scswitch(1M) and shutdown(1M) to take down a node, rather than directly killing the daemon.


770675 monitor_check: fe_method_full_name() failed for resource <%s>, resource group <%s>

Description:

During execution of a scha_control(1HA,3HA) function, the rgmd was unable to assemble the full method pathname for the MONITOR_CHECK method. This is considered a MONITOR_CHECK method failure. This in turn will prevent the attempted failover of the resource group from its current master to a new master.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


770776 INTERNAL ERROR: process_resource: Resource <%s> is R_BOOTING in PENDING_ONLINE resource group

Description:

The rgmd is attempting to bring a resource group online on a node where BOOT methods are still being run on its resources. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


771340 fatal: Resource group <%s> update failed with error <%d>; aborting node

Description:

Rgmd failed to read updated resource group from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


772294 %s requests reconfiguration in step %s

Description:

Return status at the end of a step execution indicates that a reconfiguration is required.

Solution:

None.


772395 shutdown immediate did not succeed. (%s)

Description:

Failed to shut down the Oracle server using the 'shutdown immediate' command.

Solution:

Examine the 'Stop_timeout' property of the resource and increase 'Stop_timeout' if the Oracle server takes a long time to shut down and you do not wish to use 'shutdown abort' to stop the Oracle server.


773078 Error in configuration file lookup (%s, ...): %s

Description:

Could not read configuration file udlm.conf.

Solution:

Make sure udlm.conf exists under /opt/SUNWudlm/etc and has the correct permissions.
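
A quick way to carry out this check is sketched below in Python. The function name and the returned strings are hypothetical; the path is the one named in this entry. The sketch only tests existence and read permission for the user running it, which may differ from the user under which the cluster software reads the file.

```python
import os


def describe_access(path):
    """Classify a path as 'missing', 'unreadable', or 'ok' for the current user."""
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "unreadable"
    return "ok"


print(describe_access("/opt/SUNWudlm/etc/udlm.conf"))
```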


773226 Server_url %s probe failed

Description:

The probing of the url set in the Server_url extension property failed. The agent probe will take action.

Solution:

None. The agent probe will take action. However, the cause of the failure should be investigated further. Examine the log file and syslog messages for additional information.


773366 thread create for hb_threadpool failed

Description:

The system was unable to create the thread used for heartbeat processing.

Solution:

Take steps to increase memory availability. The installation of more memory will avoid the problem with a kernel inability to create threads. For a user level process problem: install more memory, increase swap space, or reduce the peak work load.


774752 reservation error(%s) - do_scsi3_inresv() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

For the user action required by this message, see the user action for message 192619.


775696 Unable to unlock file: %s.


776199 (%s) reconfigure: cm error %s

Description:

ucmm reconfiguration failed.

Solution:

None if the next reconfiguration succeeds. If not, save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


776339 INTERNAL ERROR: postpone_stop_r: meth type <%d>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


778629 ERROR: MONITOR_STOP method is not registered for ONLINE resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


779073 in fe_set_env_vars malloc of env_name[%d] failed

Description:

The rgmd server was not able to allocate memory for an environment variable, while trying to connect to the rpc.fed server, possibly due to low memory. An error message is output to syslog.

Solution:

Investigate if the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


779089 Could not start up DCS client because we could not contact the name server.

Description:

There was a fatal error while this node was booting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


779544 "pmfctl -R": Error resuming pid %d for tag <%s>: %d

Description:

An error occurred while rpc.pmfd attempted to resume the monitoring of the indicated pid, possibly because the indicated pid exited during the attempt to resume its monitoring.

Solution:

Check if the indicated pid has exited. If this is not the case, save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


780283 clcomm: Exception in coalescing region - Lost data

Description:

While supporting an invocation, the system wanted to combine buffers and failed. The system identifies the exception prior to this message.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


780539 Stopping fault monitor: %s:%ld:%s

Description:

Fault monitor has detected an error. Fault monitor will be stopped. Error detected by fault monitor and action taken by fault monitor is indicated in message.

Solution:

None


780792 Failed to retrieve the resource type information.

Description:

A Sun Cluster data service has failed to retrieve the resource type's property information. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components.


781445 kill -0: %s

Description:

The rpc.fed server is not able to send a signal to a tag that timed out, and the error message is shown. An error message is output to syslog.

Solution:

Save the syslog messages file. Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified.


781731 Failed to retrieve the cluster handle: %s.

Description:

An API operation has failed while retrieving the cluster information.

Solution:

This may be solved by rebooting the node. For more details about API failure, check the messages from other components.


782111 This list element in System property %s is missing a protocol: %s.

Description:

The system property that was named does not have a valid format. The value of the property must include a protocol.

Solution:

Add a protocol to the property value.


782694 The value returned for property %s for resource %s was invalid.

Description:

An unexpected value was returned for the named property.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


783130 Failed to retrieve the node id for node %s: %s.

Description:

API operation has failed while retrieving the node id for the given node.

Solution:

Check whether the node name is valid. For more information about API call failure, check the messages from other components.


783581 scvxvmlg fatal error - clconf_lib_init failed, returned %d

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


784311 Network_resources_used property not set properly

Description:

There is probably more than one logical IP address in this resource group, and the Network_resources_used property is not properly set to associate the resources with the appropriate backend hosts.

Solution:

Set the Network_resources_used property for each resource in the RG to the logical IP address in the RG that is actually configured to run BV backend processes.


784560 resource %s status on node %s change to %s

Description:

This is a notification from the rgmd that a resource's fault monitor status has changed.

Solution:

This is an informational message, no user action is needed.


784607 Couldn't fork1.

Description:

The fork1(2) system call failed.

Solution:

Some system resource has been exceeded. Install more memory, increase swap space or reduce peak memory consumption.


785003 clexecd: priocntl to set ts returned %d. Exiting.

Description:

The clexecd program has encountered a failed priocntl(2) system call. The error message indicates the error number for the failure.

Solution:

clexecd program will exit and node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


785101 transition '%s' failed for cluster '%s': unknown code %d

Description:

The mentioned state transition failed for the cluster because of an unexpected command line option. udlmctl will exit.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


785154 Could not look up IP because IP was NULL.

Description:

The mapping for the given IP address in the local host files can't be done: the specified IP address is NULL.

Solution:

Check whether the IP address has a NULL value. If this is the case, recreate the resource with a valid host name. If this is not the reason, treat it as an internal error and contact your Sun service provider.


785213 reservation error(%s) - IOCDID_ISFIBRE failed for device %s, errno %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


786114 Cannot access file: %s (%s)

Description:

Unable to access the file because of the indicated reason.

Solution:

Check that the file exists and has the correct permissions.
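
The check described above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of Sun Cluster: the function name describe_access is hypothetical, and the path to test is whatever file name appears in the message.

```python
import os


def describe_access(path):
    """Report whether a path exists and which access modes the caller has.

    Hypothetical helper for illustration; it is not a Sun Cluster utility.
    """
    exists = os.path.exists(path)
    return {
        "exists": exists,
        # os.access() evaluates permissions for the calling user.
        "readable": exists and os.access(path, os.R_OK),
        "writable": exists and os.access(path, os.W_OK),
        "executable": exists and os.access(path, os.X_OK),
    }
```

Run the check as the same user that the failing program runs as, because the result depends on the caller's identity.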


786412 reservation fatal error(UNKNOWN) - clconf_lib_init() error, returned %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


786765 Failed to get host names from resource %s.

Description:

The networking information for the resource could not be retrieved.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


787063 Error in getting parameters for global service <%s> of path <%s>: %s

Description:

Cannot get information about the global service.

Solution:

Save a copy of /var/adm/messages and contact your authorized Sun service provider to determine the cause of the problem.


787529 NAFO group %s has status %s. The status of the NAFO group will be checked again in %d seconds.

Description:

The named IPMP group is in a transition state. The status will be checked again.

Solution:

This is an informational message, no user action is needed.


787616 Adapter %s is not a valid IPMP group on this node.

Description:

Validation of the adapter information has failed. The specified IPMP group does not exist on this node.

Solution:

Create an appropriate IPMP group on this node, or recreate the logical host with the correct IPMP group.


789135 The Data base probe %s failed.The WLS probe will wait for the DB to be UP before starting the WLS

Description:

The database probe (set in the extension property db_probe_script) failed. The start method will not start the WLS. The probe method will wait until the DB probe succeeds before starting the WLS.

Solution:

Make sure the DB probe (set in db_probe_script) succeeds. Once the DB is started, the WLS probe will start the WLS instance.


788145 gethostbyname() failed: %s.

Description:

gethostbyname() failed with unexpected error.

Solution:

Check whether the name service is configured correctly. Try some commands that query name servers, such as ping and nslookup, and correct any problem found. If the error persists, reboot the node.
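
In addition to ping and nslookup, the same lookup can be exercised directly. The following Python sketch is an illustration only: the function name check_hostname is hypothetical, and the host name to test must be supplied by the administrator.

```python
import socket


def check_hostname(name):
    """Try to resolve a host name the way gethostbyname() would.

    Returns (True, address) on success or (False, error text) on failure.
    Hypothetical helper for illustration; it is not a Sun Cluster utility.
    """
    try:
        return True, socket.gethostbyname(name)
    except OSError as err:
        # socket.gaierror and socket.herror are subclasses of OSError.
        return False, str(err)
```

A failure here for a name that should resolve suggests a name service configuration problem rather than a problem in the data service itself.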


789223 lkcm_sync: caller is not registered

Description:

udlm is not registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


789460 monitor_check: call to rpc.fed failed for resource <%s>, resource group <%s>, method <%s>

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed, due to a failure of the rgmd to communicate with the rpc.fed daemon. If the rpc.fed process died, this might lead to a subsequent reboot of the node. Otherwise, this will prevent a resource group on the local node from failing over to an alternate primary node.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


790758 Unable to open /dev/null: %s

Description:

While starting up, one of the rgmd daemons was not able to open /dev/null. The message contains the system error. This will prevent the daemon from starting on this node.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


791495 Unregistered syscall (%d)

Description:

An internal error has occurred. This should not happen. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


791577 Waiting for the host $i to startup.

Description:

Waiting for the specified host to startup.

Solution:

Bring the resource group containing the specified host online if it is not yet running. If the resource group is already online, the probe will take appropriate action.


791959 Error: reg_evt missing correct names

Description:

Solution:


792109 Unable to set number of file descriptors.

Description:

rpc.pmfd was unable to set the number of file descriptors used in the RPC server.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


792295 Some shared paths in file %s are invalid.


792338 The property %s must contain at least one value.

Description:

The named property does not have a legal value.

Solution:

Assign the property a value.


792683 clexecd: priocntl to set rt returned %d. Exiting.

Description:

The clexecd program has encountered a failed priocntl(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


792967 Unable to parse configuration file.

Description:

An error occurred while parsing the Netscape configuration file, either in reading the file or in reading one of the fields within the file.

Solution:

Make sure that the appropriate configuration file is located in its default location with respect to the Confdir_list property.


793575 Adaptive server terminated.

Description:

Graceful shutdown did not succeed. Adaptive server processes were killed in the STOP method.

Solution:

Please check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.


793651 Failed to parse xml for %s: %s

Description:

Solution:


794220 switchback: bad nodename <%s>

Description:

The rgmd encountered a bad node name in the Nodelist of a resource group it was trying to failback. This might indicate corruption of CCR data or rgmd in-memory state.

Solution:

Use scrgadm(1M) -pvv to examine resource group properties. If the values appear corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Please contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


794535 clcomm: Marshal Type mismatch. Expecting type %d got type %d

Description:

When MARSHAL_DEBUG is enabled, the system tags every data item marshalled to support an invocation. This reports that the current data item in the received message does not have the expected type. The received message format is wrong.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


795047 Stop fault monitor using pmfadm failed. tag %s error=%d

Description:

The attempt to stop the fault monitor using the Process Monitoring Facility (PMF) failed. The PMF tag is indicated in the message. The error returned by PMF is indicated in the message.

Solution:

Stop fault monitor processes. Please report this problem.


795062 Stop fault monitor using pmfadm failed. tag %s error=%s

Description:

The attempt to stop the fault monitor using the Process Monitoring Facility (PMF) failed. The PMF tag is indicated in the message. The error returned by PMF is indicated in the message.

Solution:

Stop fault monitor processes. Please report this problem.


795381 t_open: %s

Description:

Call to t_open() failed. The "t_open" man page describes possible error codes. udlm exits and the node will abort and panic.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


795754 scha_control: resource <%s> restart request is rejected because the resource type <%s> must have START and STOP methods

Description:

A resource monitor (or some other program) is attempting to restart the indicated resource by calling scha_control(1HA,3HA). This request is rejected because the resource type fails to declare both a START method and a STOP method. This represents a bug in the calling program, because the resource_restart feature can only be applied to resources that have STOP and START methods. Instead of attempting to restart the individual resource, the programmer may use scha_control(RESTART) to restart the resource group.

Solution:

The resource group may be restarted manually on the same node or switched to another node by using scswitch(1M) or the equivalent GUI command. Contact the author of the data service (or of whatever program is attempting to call scha_control) and report the error.


796536 Password file %s is not readable: %s

Description:

For the secure server to run, a password file named keypass is required. This file could not be read, which resulted in an error when trying to start the Data Service.

Solution:

Create the keypass file and place it under the Confdir_list path for this resource. Make sure that the file is readable.


796771 check_for_ccrdata failed malloc of size %d

Description:

Call to malloc failed. The "malloc" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.


797604 CMM: Connectivity of quorum device %ld (%s) has been changed from 0x%llx to 0x%llx.

Description:

The number of configured paths to the specified quorum device has been changed as indicated. The connectivity information is depicted as bitmasks.

Solution:

This is an informational message, no user action is needed.
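
To compare the two bitmasks in the message, it can help to list which bits changed. The Python sketch below is an illustration only: the function names are hypothetical, and it reports 0-based bit positions without assuming how Sun Cluster maps bits to nodes or paths, which this guide does not specify.

```python
def set_bits(mask):
    """Return the 0-based positions of the bits that are set in mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def connectivity_change(old_mask, new_mask):
    """Show which bits were added and removed between two bitmasks.

    Hypothetical helper for illustration; it is not a Sun Cluster utility.
    """
    return {
        "added": set_bits(new_mask & ~old_mask),
        "removed": set_bits(old_mask & ~new_mask),
    }
```

For example, a change from 0x3 to 0x1 reports bit 1 as removed and no bits added.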


798060 Error opening procfs status file <%s> for tag <%s>: %s

Description:

The rpc.pmfd server was not able to open a procfs status file, and the system error is shown. procfs status files are required in order to monitor user processes.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


798175 sema_wait: %s

Description:

The rpc.pmfd server was not able to act on a semaphore. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


798318 Could not verify status of %s.

Description:

A critical method was unable to determine the status of the specified service or resource.

Solution:

Please examine other messages in the /var/adm/messages file to determine the cause of this problem. Also verify if the specified service or resource is available or not. If not available, start the service or resource and retry the operation which failed.


798514 Starting fault monitor. pmf tag %s

Description:

Informational message. Fault monitor is being started under control of Process Monitoring Facility (PMF), with the tag indicated in message.

Solution:

None.


798658 Failed to get the resource type name: %s.

Description:

While retrieving the resource information, API operation has failed to retrieve the resource type name.

Solution:

This is an internal error. Contact your authorized Sun service provider. For a more detailed error description, check the syslog messages.


799348 INTERNAL ERROR: MONITOR_START method is not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


799426 clcomm: can't ifkconfig private interface: %s:%d cmd %d error %d

Description:

The system failed to configure private network device for IP communications across the private interconnect of this device and IP address, resulting in the error identified in the message.

Solution:

Ensure that the network interconnect device is supported. Otherwise, contact your authorized Sun service provider to determine whether a workaround or patch is available.


799817 Failed to stop the application using SIGTERM. Will try to stop using SIGKILL

Description:

The application could not be stopped by sending SIGTERM. The STOP method will try to stop the application by sending SIGKILL with infinite timeout.

Solution:

None.
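
The escalation described above (SIGTERM first, then SIGKILL with no time limit) can be sketched as follows. This is a minimal Python illustration of the general pattern, not the actual STOP method: the function name stop_process and the term_timeout parameter are hypothetical.

```python
import signal
import subprocess


def stop_process(proc, term_timeout=5.0):
    """Stop a child process, escalating from SIGTERM to SIGKILL.

    proc is a subprocess.Popen object. Returns the name of the signal
    that was needed. Hypothetical helper for illustration only.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        # Give the process a bounded time to exit after SIGTERM.
        proc.wait(timeout=term_timeout)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        # On POSIX systems, Popen.kill() sends SIGKILL.
        proc.kill()
        # Wait without a timeout, mirroring the "infinite timeout" above.
        proc.wait()
        return "SIGKILL"
```

A process cannot catch or ignore SIGKILL, which is why the second wait is expected to finish even when the first one timed out.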