This section describes problems that can affect the Support for Oracle RAC framework resource group.
If a fatal problem occurs during the initialization of Support for Oracle RAC, the node panics with an error message similar to the following:
panic[cpu0]/thread=40037e60: Failfast: Aborting because "ucmmd" died 30 seconds ago
Description: A component that the UCMM controls returned an error to the UCMM during a reconfiguration.
Cause: The most common causes of this problem are as follows:
A node might also panic during the initialization of Support for Oracle RAC because a reconfiguration step has timed out. For more information, see Node Panic Caused by a Timeout.
Solution: For instructions to correct the problem, see How to Recover From a Failure of the ucmmd Daemon or a Related Component.
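To confirm which daemon triggered the failfast, the daemon name can be pulled out of a panic line. The following is a minimal sketch, assuming the panic line has the same shape as the example message above. The function name failfast_daemon is hypothetical, and where the panic line is recorded on a given system is not covered here.

```shell
#!/bin/sh
# failfast_daemon (hypothetical helper): read text on standard input and
# print the name of the daemon quoted in each "Failfast: Aborting because"
# line. Lines that do not match the assumed message shape are ignored.
failfast_daemon() {
    sed -n 's/.*Failfast: Aborting because "\([^"]*\)" died.*/\1/p'
}
```

For example, piping the sample panic message from this section into failfast_daemon prints ucmmd.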
The UCMM daemon, ucmmd, manages the reconfiguration of Support for Oracle RAC. When a cluster is booted or rebooted, this daemon is started only after all components of Support for Oracle RAC are validated. If the validation of a component on a node fails, the ucmmd daemon fails to start on the node.
The most common causes of this problem are as follows:
An error occurred during a previous reconfiguration of a component of Support for Oracle RAC.
A step in a previous reconfiguration of Support for Oracle RAC timed out, causing the node on which the timeout occurred to panic.
For instructions to correct the problem, see How to Recover From a Failure of the ucmmd Daemon or a Related Component.
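Perform this task to correct the problems that are described in the preceding sections: a node panic during the initialization of Support for Oracle RAC, and a failure of the ucmmd daemon to start.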
For the location of the log files for UCMM reconfigurations, see Sources of Diagnostic Information.
When you examine these files, start at the most recent message and work backward until you identify the cause of the problem.
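Reading a log from the most recent message backward can be sketched as follows. This is an illustration only: the function names newest_first and recent_problems are hypothetical, the search pattern is a guess at useful keywords and not an official message format, and the path of the log file must be taken from Sources of Diagnostic Information.

```shell
#!/bin/sh
# newest_first (hypothetical helper): print a log file in reverse line
# order, so that the most recent message is shown first.
newest_first() {
    log_file=$1
    # awk buffers every line, then emits the lines from last to first.
    awk '{ lines[NR] = $0 } END { for (i = NR; i >= 1; i--) print lines[i] }' "$log_file"
}

# recent_problems (hypothetical helper): show, newest first, only the
# lines that mention an error or a timeout. The pattern is illustrative.
recent_problems() {
    newest_first "$1" | grep -i -E 'error|timed out|timeout'
}
```

A typical call would be recent_problems /path/to/reconfiguration.log, where the path is a placeholder for the actual log file location.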
For more information about error messages that might indicate the cause of reconfiguration errors, see Oracle Solaris Cluster Error Messages Guide.
For example:
For more information, see Node Panic Caused by a Timeout.
Only some solutions require a reboot. For example, increasing the amount of shared memory requires a reboot. However, increasing the value of a step timeout does not require a reboot.
For more information about how to reboot a node, see Shutting Down and Booting a Single Node in a Cluster in Oracle Solaris Cluster System Administration Guide.
This step refreshes the resource group with the configuration changes you made.
# clresourcegroup offline -n node rac-fmwk-rg
-n node: Specifies the node name or node identifier (ID) of the node where the problem occurred.
rac-fmwk-rg: Specifies the name of the resource group that is to be taken offline.
# clresourcegroup online -eM -n node rac-fmwk-rg
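The offline and online commands above can be combined into one small wrapper. The following is a sketch under stated assumptions, not a supported tool: the function names run_cmd and refresh_rac_framework and the DRY_RUN variable are hypothetical. By default the wrapper only prints the commands, so that the node and resource group names can be checked before anything is run.

```shell
#!/bin/sh
# run_cmd (hypothetical helper): print the command when DRY_RUN is 1
# (the default in this sketch), otherwise execute it.
run_cmd() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# refresh_rac_framework (hypothetical helper): take the framework resource
# group offline on one node, then bring it back online on that node, using
# the same two commands that are shown in this procedure.
refresh_rac_framework() {
    node=$1
    rg=$2
    if [ -z "$node" ] || [ -z "$rg" ]; then
        echo "usage: refresh_rac_framework node resource-group" >&2
        return 2
    fi
    # Stop if the offline step fails; do not attempt the online step.
    run_cmd clresourcegroup offline -n "$node" "$rg" || return 1
    run_cmd clresourcegroup online -eM -n "$node" "$rg"
}
```

To run the commands for real, set DRY_RUN=0 and call the function as superuser on a cluster node, for example refresh_rac_framework node rac-fmwk-rg with the actual node and resource group names.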