If you encounter a problem with Sun Cluster Support for Oracle RAC, troubleshoot the problem by using the techniques that are described in the following sections.
The status of resource groups and resources for Sun Cluster Support for Oracle RAC indicates the status of Oracle RAC in your cluster. Use Sun Cluster maintenance commands to obtain this status information.
To obtain status information for resource groups, use the clresourcegroup(1CL) command.
To obtain status information for resources, use the clresource(1CL) command.
This procedure provides the long forms of the Sun Cluster maintenance commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix A, Sun Cluster Object-Oriented Commands, in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.
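For instance, clrg and clrs are the short forms of clresourcegroup and clresource, so the following invocations should be equivalent:

```
# clresourcegroup status +    (long form)
# clrg status +               (short form)
```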
Become superuser or assume a role that provides solaris.cluster.read RBAC authorization.
Display status information for the Sun Cluster objects in which you are interested.
For example:
To display status information for all resource groups in your cluster, type the following command:
```
# clresourcegroup status +
```
To display status information for all resources in a resource group, type the following command:
```
# clresource status -g resource-group +
```

-g resource-group: Specifies the resource group that contains the resources whose status information you are displaying.
For information about options that you can specify to filter the status information that is displayed, see the clresourcegroup(1CL) and clresource(1CL) man pages.
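As a sketch of such filtering, assuming the resource-type and node names used in the examples that follow, you might narrow the display by type or by node:

```
# clresource status -t SUNW.scalable_rac_server_proxy +
# clresource status -n pclus1 +
```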
The following examples show the status of resource groups and resources for a configuration of Sun Cluster Support for Oracle RAC on a four-node cluster. Each node is a machine that uses the SPARC® processor.
The cluster in this example is running Oracle RAC 10g Release 2 or 11g. The configuration uses a Sun QFS shared file system on Solaris Volume Manager for Sun Cluster to store Oracle files, and it includes a multiple-owner volume-manager framework resource group to contain the volume manager resource.
The resource groups and resources for this configuration are shown in the following table.
| Resource Group | Purpose | Resource Type | Resource Instance Name |
|---|---|---|---|
| rac-framework-rg | RAC framework resource group | SUNW.rac_framework | rac-framework-rs |
| | | SUNW.rac_udlm | rac-udlm-rs |
| | | SUNW.crs_framework | crs_framework-rs |
| vucmm-framework-rg | Multiple-owner volume-manager framework resource group | SUNW.vucmm_framework | vucmm-framework-rs |
| | | SUNW.vucmm_svm | vucmm-svm-rs |
| scaldg-rg | Resource group for scalable device-group resources | SUNW.ScalDeviceGroup | scaloradg-rs |
| qfsmds-rg | Resource group for Sun QFS metadata server resources | SUNW.qfs | qfs-db_qfs-OraHome-rs, qfs-db_qfs-OraData-rs |
| scalmnt-rg | Resource group for scalable file-system mount-point resources | SUNW.ScalMountPoint | scal-db_qfs-OraHome-rs, scal-db_qfs-OraData-rs |
| rac_server_proxy-rg | RAC database resource group | SUNW.scalable_rac_server_proxy | rac_server_proxy-rs |
This example provides the following status information for a RAC framework resource group that is faulty.
A reconfiguration error has prevented the rac_framework resource from starting on cluster node pclus1.
The effects of this reconfiguration error on resource groups and other resources are as follows:
The rac-framework-rg resource group is Online faulted on cluster node pclus1, and its rac-framework-rs resource is in the Start failed state.
The rac-udlm-rs, vucmm-svm-rs, and crs_framework-rs resources are offline on cluster node pclus1.
The remaining multi-master resource groups stay online on cluster node pclus1, but the rac_server_proxy-rg resource group is Pending online blocked on that node and its rac_server_proxy-rs resource is offline there.
All failover resource groups have failed over from cluster node pclus1 to a secondary node.
All multi-master resource groups and the resources that the groups contain are online on the remaining nodes.
```
# clresourcegroup status +

=== Cluster Resource Groups ===

Group Name            Node Name   Suspended   Status
----------            ---------   ---------   ------
rac-framework-rg      pclus1      No          Online faulted
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

vucmm-framework-rg    pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

scaldg-rg             pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

qfsmds-rg             pclus1      No          Offline
                      pclus2      No          Online
                      pclus3      No          Offline
                      pclus4      No          Offline

scalmnt-rg            pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

rac_server_proxy-rg   pclus1      No          Pending online blocked
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

# clresource status -g rac-framework-rg +

=== Cluster Resources ===

Resource Name      Node Name   State          Status Message
-------------      ---------   -----          --------------
rac-framework-rs   pclus1      Start failed   Faulted - Error in previous reconfiguration.
                   pclus2      Online         Online
                   pclus3      Online         Online
                   pclus4      Online         Online

rac-udlm-rs        pclus1      Offline        Offline
                   pclus2      Online         Online
                   pclus3      Online         Online
                   pclus4      Online         Online

crs_framework-rs   pclus1      Offline        Offline
                   pclus2      Online         Online
                   pclus3      Online         Online
                   pclus4      Online         Online

# clresource status -g vucmm-framework-rg +

=== Cluster Resources ===

Resource Name        Node Name   State     Status Message
-------------        ---------   -----     --------------
vucmm-framework-rs   pclus1      Online    Online
                     pclus2      Online    Online
                     pclus3      Online    Online
                     pclus4      Online    Online

vucmm-svm-rs         pclus1      Offline   Offline
                     pclus2      Online    Online
                     pclus3      Online    Online
                     pclus4      Online    Online

# clresource status -g scaldg-rg +

=== Cluster Resources ===

Resource Name   Node Name   State    Status Message
-------------   ---------   -----    --------------
scaloradg-rs    pclus1      Online   Online - Diskgroup online
                pclus2      Online   Online - Diskgroup online
                pclus3      Online   Online - Diskgroup online
                pclus4      Online   Online - Diskgroup online

# clresource status -g qfsmds-rg +

=== Cluster Resources ===

Resource Name           Node Name   State     Status Message
-------------           ---------   -----     --------------
qfs-db_qfs-OraHome-rs   pclus1      Offline   Offline
                        pclus2      Online    Online - Service is online.
                        pclus3      Offline   Offline
                        pclus4      Offline   Offline

qfs-db_qfs-OraData-rs   pclus1      Offline   Offline
                        pclus2      Online    Online - Service is online.
                        pclus3      Offline   Offline
                        pclus4      Offline   Offline

# clresource status -g scalmnt-rg +

=== Cluster Resources ===

Resource Name            Node Name   State    Status Message
-------------            ---------   -----    --------------
scal-db_qfs-OraHome-rs   pclus1      Online   Online
                         pclus2      Online   Online
                         pclus3      Online   Online
                         pclus4      Online   Online

scal-db_qfs-OraData-rs   pclus1      Online   Online
                         pclus2      Online   Online
                         pclus3      Online   Online
                         pclus4      Online   Online

# clresource status -g rac_server_proxy-rg +

=== Cluster Resources ===

Resource Name         Node Name   State     Status Message
-------------         ---------   -----     --------------
rac_server_proxy-rs   pclus1      Offline   Offline
                      pclus2      Online    Online - Oracle instance UP
                      pclus3      Online    Online - Oracle instance UP
                      pclus4      Online    Online - Oracle instance UP
```
This example provides the following status information for a RAC database resource group that is faulty:
The RAC database on pclus1 has failed to start. The effects of this failure are as follows:
The rac_server_proxy-rg resource group is online, but faulted on node pclus1.
The rac_server_proxy-rs resource is offline on node pclus1.
All other multi-master resource groups and the resources that the groups contain are online on all nodes.
All failover resource groups and the resources that the groups contain are online on their primary nodes and offline on the remaining nodes.
```
# clresourcegroup status +

=== Cluster Resource Groups ===

Group Name            Node Name   Suspended   Status
----------            ---------   ---------   ------
rac-framework-rg      pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

vucmm-framework-rg    pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

scaldg-rg             pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

qfsmds-rg             pclus1      No          Online
                      pclus2      No          Offline
                      pclus3      No          Offline
                      pclus4      No          Offline

scalmnt-rg            pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

rac_server_proxy-rg   pclus1      No          Online faulted
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

# clresource status -g rac_server_proxy-rg +

=== Cluster Resources ===

Resource Name         Node Name   State     Status Message
-------------         ---------   -----     --------------
rac_server_proxy-rs   pclus1      Offline   Offline - Oracle instance DOWN
                      pclus2      Online    Online - Oracle instance UP
                      pclus3      Online    Online - Oracle instance UP
                      pclus4      Online    Online - Oracle instance UP

# clresource status -g rac-framework-rg +

=== Cluster Resources ===

Resource Name      Node Name   State    Status Message
-------------      ---------   -----    --------------
rac-framework-rs   pclus1      Online   Online
                   pclus2      Online   Online
                   pclus3      Online   Online
                   pclus4      Online   Online

rac-udlm-rs        pclus1      Online   Online
                   pclus2      Online   Online
                   pclus3      Online   Online
                   pclus4      Online   Online

crs_framework-rs   pclus1      Online   Online
                   pclus2      Online   Online
                   pclus3      Online   Online
                   pclus4      Online   Online

# clresource status -g vucmm-framework-rg +

=== Cluster Resources ===

Resource Name        Node Name   State    Status Message
-------------        ---------   -----    --------------
vucmm-framework-rs   pclus1      Online   Online
                     pclus2      Online   Online
                     pclus3      Online   Online
                     pclus4      Online   Online

vucmm-svm-rs         pclus1      Online   Online
                     pclus2      Online   Online
                     pclus3      Online   Online
                     pclus4      Online   Online

# clresource status -g scaldg-rg +

=== Cluster Resources ===

Resource Name   Node Name   State    Status Message
-------------   ---------   -----    --------------
scaloradg-rs    pclus1      Online   Online - Diskgroup online
                pclus2      Online   Online - Diskgroup online
                pclus3      Online   Online - Diskgroup online
                pclus4      Online   Online - Diskgroup online

# clresource status -g qfsmds-rg +

=== Cluster Resources ===

Resource Name           Node Name   State     Status Message
-------------           ---------   -----     --------------
qfs-db_qfs-OraHome-rs   pclus1      Online    Online - Service is online.
                        pclus2      Offline   Offline
                        pclus3      Offline   Offline
                        pclus4      Offline   Offline

qfs-db_qfs-OraData-rs   pclus1      Online    Online - Service is online.
                        pclus2      Offline   Offline
                        pclus3      Offline   Offline
                        pclus4      Offline   Offline

# clresource status -g scalmnt-rg +

=== Cluster Resources ===

Resource Name            Node Name   State    Status Message
-------------            ---------   -----    --------------
scal-db_qfs-OraHome-rs   pclus1      Online   Online
                         pclus2      Online   Online
                         pclus3      Online   Online
                         pclus4      Online   Online

scal-db_qfs-OraData-rs   pclus1      Online   Online
                         pclus2      Online   Online
                         pclus3      Online   Online
                         pclus4      Online   Online
```
This example shows the status of an Oracle RAC configuration that is operating correctly. The example indicates that the status of resource groups and resources in this configuration is as follows:
All multi-master resource groups and the resources that the groups contain are online on all nodes.
All failover resource groups and the resources that the groups contain are online on their primary nodes and offline on the remaining nodes.
```
# clresourcegroup status +

=== Cluster Resource Groups ===

Group Name            Node Name   Suspended   Status
----------            ---------   ---------   ------
rac-framework-rg      pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

vucmm-framework-rg    pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

scaldg-rg             pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

qfsmds-rg             pclus1      No          Online
                      pclus2      No          Offline
                      pclus3      No          Offline
                      pclus4      No          Offline

scalmnt-rg            pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

rac_server_proxy-rg   pclus1      No          Online
                      pclus2      No          Online
                      pclus3      No          Online
                      pclus4      No          Online

# clresource status -g rac-framework-rg +

=== Cluster Resources ===

Resource Name      Node Name   State    Status Message
-------------      ---------   -----    --------------
rac-framework-rs   pclus1      Online   Online
                   pclus2      Online   Online
                   pclus3      Online   Online
                   pclus4      Online   Online

rac-udlm-rs        pclus1      Online   Online
                   pclus2      Online   Online
                   pclus3      Online   Online
                   pclus4      Online   Online

crs_framework-rs   pclus1      Online   Online
                   pclus2      Online   Online
                   pclus3      Online   Online
                   pclus4      Online   Online

# clresource status -g vucmm-framework-rg +

=== Cluster Resources ===

Resource Name        Node Name   State    Status Message
-------------        ---------   -----    --------------
vucmm-framework-rs   pclus1      Online   Online
                     pclus2      Online   Online
                     pclus3      Online   Online
                     pclus4      Online   Online

vucmm-svm-rs         pclus1      Online   Online
                     pclus2      Online   Online
                     pclus3      Online   Online
                     pclus4      Online   Online

# clresource status -g scaldg-rg +

=== Cluster Resources ===

Resource Name   Node Name   State    Status Message
-------------   ---------   -----    --------------
scaloradg-rs    pclus1      Online   Online - Diskgroup online
                pclus2      Online   Online - Diskgroup online
                pclus3      Online   Online - Diskgroup online
                pclus4      Online   Online - Diskgroup online

# clresource status -g qfsmds-rg +

=== Cluster Resources ===

Resource Name           Node Name   State     Status Message
-------------           ---------   -----     --------------
qfs-db_qfs-OraHome-rs   pclus1      Online    Online - Service is online.
                        pclus2      Offline   Offline
                        pclus3      Offline   Offline
                        pclus4      Offline   Offline

qfs-db_qfs-OraData-rs   pclus1      Online    Online - Service is online.
                        pclus2      Offline   Offline
                        pclus3      Offline   Offline
                        pclus4      Offline   Offline

# clresource status -g scalmnt-rg +

=== Cluster Resources ===

Resource Name            Node Name   State    Status Message
-------------            ---------   -----    --------------
scal-db_qfs-OraHome-rs   pclus1      Online   Online
                         pclus2      Online   Online
                         pclus3      Online   Online
                         pclus4      Online   Online

scal-db_qfs-OraData-rs   pclus1      Online   Online
                         pclus2      Online   Online
                         pclus3      Online   Online
                         pclus4      Online   Online

# clresource status -g rac_server_proxy-rg +

=== Cluster Resources ===

Resource Name         Node Name   State    Status Message
-------------         ---------   -----    --------------
rac_server_proxy-rs   pclus1      Online   Online - Oracle instance UP
                      pclus2      Online   Online - Oracle instance UP
                      pclus3      Online   Online - Oracle instance UP
                      pclus4      Online   Online - Oracle instance UP
```
If the state of a scalable device group resource or a file-system mount-point resource changes, the new state is logged through the syslog(3C) function.
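Because syslog messages are typically routed to the Solaris system messages file, a quick sketch of checking for a recent state change might look like the following (the grep pattern is only an illustration):

```
# tail -50 /var/adm/messages
# grep ScalMountPoint /var/adm/messages
```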
The directories /var/cluster/ucmm and /var/cluster/vucmm contain diagnostic information, including log files for reconfigurations of the UCMM and of the multiple-owner volume-manager framework, respectively.
The directory /var/opt/SUNWscor/oracle_server/proxyresource contains log files for the resource that represents the Oracle RAC 10g Release 2 or 11g proxy server. Messages for the server-side components and client-side components of the proxy server resource are written to separate files:
Messages for server-side components are written to the file message_log.resource.
Messages for client-side components are written to the file message_log.client.resource.
In these file names and directory names, resource is the name of the resource that represents the Oracle RAC server component.
The directory /var/opt/SUNWscor/oracle_server contains log files for the Oracle 9i RAC server resource. Each file is named /var/opt/SUNWscor/oracle_server/message_log.resource.
The system messages file also contains diagnostic information.
If a problem occurs with Sun Cluster Support for Oracle RAC, consult these files to obtain information about the cause of the problem.
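For example, a sketch of a first pass through these sources, assuming a proxy resource named rac_server_proxy-rs (the name is illustrative; the path follows the naming convention described above):

```
# ls -lt /var/cluster/ucmm /var/cluster/vucmm
# tail -50 /var/opt/SUNWscor/oracle_server/proxyrac_server_proxy-rs/message_log.rac_server_proxy-rs
# tail -50 /var/adm/messages
```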
The subsections that follow describe problems that can affect Sun Cluster Support for Oracle RAC. Each subsection provides information about the cause of the problem and a solution to the problem.
Failure of a Multiple-Owner Volume-Manager Framework Resource Group
SUNW.qfs Registration Fails Because the Registration File Is Not Found
Failure of a SUNW.rac_framework or SUNW.vucmm_framework Resource to Start
This section describes problems that can affect the RAC framework resource group.
Node Panic During Initialization of Sun Cluster Support for Oracle RAC
How to Recover From a Failure of the ucmmd Daemon or a Related Component
If a fatal problem occurs during the initialization of Sun Cluster Support for Oracle RAC, the node panics with an error message similar to the following:

```
panic[cpu0]/thread=40037e60: Failfast: Aborting because "ucmmd" died 30 seconds ago
```

Description: A component that the UCMM controls returned an error to the UCMM during a reconfiguration.

Cause: The most common causes of this problem are as follows:
SPARC: The ORCLudlm package that contains the Oracle UDLM is not installed.
SPARC: The version of the Oracle UDLM is incompatible with the version of Sun Cluster Support for Oracle RAC.
SPARC: The amount of shared memory is insufficient to enable the Oracle UDLM to start.
A node might also panic during the initialization of Sun Cluster Support for Oracle RAC because a reconfiguration step has timed out. For more information, see Node Panic Caused by a Timeout.
Solution: For instructions to correct the problem, see How to Recover From a Failure of the ucmmd Daemon or a Related Component.

When the node is a global-cluster voting node, the panic brings down the entire machine. When the node is a zone-cluster node, the panic brings down only that zone; other zones remain unaffected.
The UCMM daemon, ucmmd, manages the reconfiguration of Sun Cluster Support for Oracle RAC. When a cluster is booted or rebooted, this daemon is started only after all components of Sun Cluster Support for Oracle RAC are validated. If the validation of a component on a node fails, the ucmmd daemon fails to start on the node.
The most common causes of this problem are as follows:
SPARC: The ORCLudlm package that contains the Oracle UDLM is not installed.
An error occurred during a previous reconfiguration of a component of Sun Cluster Support for Oracle RAC.
A step in a previous reconfiguration of Sun Cluster Support for Oracle RAC timed out, causing the node on which the timeout occurred to panic.
For instructions to correct the problem, see How to Recover From a Failure of the ucmmd Daemon or a Related Component.
Perform this task to correct the problems that are described in the following sections:
This procedure provides the long forms of the Sun Cluster maintenance commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix A, Sun Cluster Object-Oriented Commands, in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.
To determine the cause of the problem, examine the log files for UCMM reconfigurations and the system messages file.
For the location of the log files for UCMM reconfigurations, see Sources of Diagnostic Information.
When you examine these files, start at the most recent message and work backward until you identify the cause of the problem.
For more information about error messages that might indicate the cause of reconfiguration errors, see Sun Cluster Error Messages Guide for Solaris OS.
Correct the problem that caused the component to return an error to the UCMM.
For example:
SPARC: If your Oracle release requires Oracle UDLM and the ORCLudlm package that contains the Oracle UDLM is not installed, ensure that the package is installed.
Not every Oracle release uses the Oracle UDLM; this step applies only to releases that require it.
Ensure that you have completed all the procedures that precede installing and configuring the Oracle UDLM software.
The procedures that you must complete are listed in Table 1–1.
Ensure that the Oracle UDLM software is correctly installed and configured.
For more information, see SPARC: Installing the Oracle UDLM.
SPARC: If the version of the Oracle UDLM is incompatible with the version of Sun Cluster Support for Oracle RAC, install a compatible version of the package.
For more information, see SPARC: Installing the Oracle UDLM.
SPARC: If the amount of shared memory is insufficient to enable the Oracle UDLM to start, increase the amount of shared memory (a sketch follows this list).
For more information, see How to Configure Shared Memory for the Oracle RAC Software in the Global Cluster.
If a reconfiguration step has timed out, increase the value of the extension property that specifies the timeout for the step.
For more information, see Node Panic Caused by a Timeout.
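A minimal sketch of the shared-memory fix, assuming Solaris 9 semantics in /etc/system (the value shown is illustrative; on Solaris 10, a project resource control replaces the /etc/system tunable, and the referenced procedure gives values appropriate to your Oracle release):

```
# Solaris 9: add to /etc/system, then reboot (illustrative value)
set shmsys:shminfo_shmmax=0x40000000

# Solaris 10: use a project resource control instead, for example:
# projmod -s -K "project.max-shm-memory=(privileged,4G,deny)" user.oracle
```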
If the solution to the problem requires a reboot, reboot the node where the problem occurred.
The solution to only certain problems requires a reboot. For example, increasing the amount of shared memory requires a reboot. However, increasing the value of a step timeout does not require a reboot.
For more information about how to reboot a node, see Shutting Down and Booting a Single Node in a Cluster in Sun Cluster System Administration Guide for Solaris OS.
On the node where the problem occurred, take offline and bring online the RAC framework resource group.
This step refreshes the resource group with the configuration changes you made.
Become superuser or assume a role that provides solaris.cluster.admin RBAC authorization.
Type the command to take offline the RAC framework resource group and its resources.
```
# clresourcegroup offline -n node rac-fmwk-rg
```

-n node: Specifies the node name or node identifier (ID) of the node where the problem occurred.

rac-fmwk-rg: Specifies the name of the resource group that is to be taken offline.
Type the command to bring online and in a managed state the RAC framework resource group and its resources.
```
# clresourcegroup online -emM -n node rac-fmwk-rg
```
This section describes problems that can affect the multiple-owner volume-manager framework resource group.
Node Panic During Initialization of the Multiple-Owner Volume-Manager Framework
How to Recover From a Failure of the vucmmd Daemon or a Related Component
If a fatal problem occurs during the initialization of the multiple-owner volume-manager framework, the node panics with an error message similar to the following. When the node is a global-cluster voting node, the panic brings down the entire machine.

```
panic[cpu0]/thread=40037e60: Failfast: Aborting because "vucmmd" died 30 seconds ago
```

Description: A component that the multiple-owner volume-manager framework controls returned an error to the framework during a reconfiguration.

Cause: The most common cause of this problem is that the license for Veritas Volume Manager (VxVM) is missing or has expired.
A node might also panic during the initialization of the multiple-owner volume-manager framework because a reconfiguration step has timed out. For more information, see Node Panic Caused by a Timeout.
Solution: For instructions to correct the problem, see How to Recover From a Failure of the vucmmd Daemon or a Related Component.
The multiple-owner volume-manager framework daemon, vucmmd, manages the reconfiguration of the multiple-owner volume-manager framework. When a cluster is booted or rebooted, this daemon is started only after all components of the multiple-owner volume-manager framework are validated. If the validation of a component on a node fails, the vucmmd daemon fails to start on the node.
The most common causes of this problem are as follows:
An error occurred during a previous reconfiguration of a component of the multiple-owner volume-manager framework.
A step in a previous reconfiguration of the multiple-owner volume-manager framework timed out, causing the node on which the timeout occurred to panic.
For instructions to correct the problem, see How to Recover From a Failure of the vucmmd Daemon or a Related Component.
Perform this task to correct the problems that are described in the following sections:
This procedure provides the long forms of the Sun Cluster maintenance commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix A, Sun Cluster Object-Oriented Commands, in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.
To determine the cause of the problem, examine the log files for multiple-owner volume-manager framework reconfigurations and the system messages file.
For the location of the log files for multiple-owner volume-manager framework reconfigurations, see Sources of Diagnostic Information.
When you examine these files, start at the most recent message and work backward until you identify the cause of the problem.
For more information about error messages that might indicate the cause of reconfiguration errors, see Sun Cluster Error Messages Guide for Solaris OS.
Correct the problem that caused the component to return an error to the multiple-owner volume-manager framework.
For example:
If the license for VxVM is missing or has expired, ensure that VxVM is correctly installed and licensed.
Verify that you have correctly installed your volume manager packages.
If you are using VxVM, verify that the software is installed and that the license for the VxVM cluster feature is valid.
A zone cluster does not support VxVM.
If a reconfiguration step has timed out, increase the value of the extension property that specifies the timeout for the step.
For more information, see Node Panic Caused by a Timeout.
If the solution to the problem requires a reboot, reboot the node where the problem occurred.
The solution to only certain problems requires a reboot. For example, increasing the amount of shared memory requires a reboot. However, increasing the value of a step timeout does not require a reboot.
For more information about how to reboot a node, see Shutting Down and Booting a Single Node in a Cluster in Sun Cluster System Administration Guide for Solaris OS.
On the node where the problem occurred, take offline and bring online the multiple-owner volume-manager framework resource group.
This step refreshes the resource group with the configuration changes you made.
Become superuser or assume a role that provides solaris.cluster.admin RBAC authorization.
Type the command to take offline the multiple-owner volume-manager framework resource group and its resources.
```
# clresourcegroup offline -n node vucmm-fmwk-rg
```

-n node: Specifies the node name or node identifier (ID) of the node where the problem occurred.

vucmm-fmwk-rg: Specifies the name of the resource group that is to be taken offline.
Type the command to bring online and in a managed state the multiple-owner volume-manager framework resource group and its resources.
```
# clresourcegroup online -emM -n node vucmm-fmwk-rg
```
Sun Cluster resource-type registration files are located in the /opt/cluster/lib/rgm/rtreg/ or /usr/cluster/lib/rgm/rtreg/ directory. The SUNW.qfs resource-type registration file is located in the /opt/SUNWsamfs/sc/etc/ directory.
If Sun Cluster software is already installed when you install Sun QFS software, the necessary mapping to the SUNW.qfs registration file is automatically created. But if Sun Cluster software is not already installed when you install Sun QFS software, the necessary mapping to the SUNW.qfs registration file is not made, even when Sun Cluster software is later installed. Attempts to register the SUNW.qfs resource type therefore fail because the Sun Cluster software is unaware of the location of its registration file.
To enable Sun Cluster software to locate the SUNW.qfs resource type, create a symbolic link to its registration file:
```
# cd /usr/cluster/lib/rgm/rtreg
# ln -s /opt/SUNWsamfs/sc/etc/SUNW.qfs SUNW.qfs
```
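After you create the link, you can verify it and retry the registration; for example:

```
# ls -l /usr/cluster/lib/rgm/rtreg/SUNW.qfs
# clresourcetype register SUNW.qfs
```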
The timing out of any step in the reconfiguration of Sun Cluster Support for Oracle RAC causes the node on which the timeout occurred to panic.
To prevent reconfiguration steps from timing out, tune the timeouts that depend on your cluster configuration. For more information, see Guidelines for Setting Timeouts.
If a reconfiguration step times out, use the Sun Cluster maintenance commands to increase the value of the extension property that specifies the timeout for the step. For more information, see Appendix C, Sun Cluster Support for Oracle RAC Extension Properties.
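For example, a sketch that raises the reservation step timeout on a SUNW.rac_framework resource, assuming the resource name rac-framework-rs from the earlier examples (check Appendix C for the extension property that corresponds to the step that timed out):

```
# clresource set -p Reservation_timeout=350 rac-framework-rs
```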
After you have increased the value of the extension property, bring online the RAC framework resource group on the node that panicked.
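Using the node and resource-group names from the earlier examples, that might look like:

```
# clresourcegroup online -n pclus1 rac-framework-rg
```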
If a SUNW.rac_framework or SUNW.vucmm_framework resource fails to start, verify the status of the resource to determine the cause of the failure. For more information, see How to Verify the Status of Sun Cluster Support for Oracle RAC.
The state of a resource that failed to start is shown as Start failed. The associated status message indicates the cause of the failure to start.
This section contains the following information:
The following status messages are associated with the failure of a SUNW.rac_framework resource to start:
Faulted - ucmmd is not running
Description: The ucmmd daemon is not running on the node where the resource resides.

Solution: For information about how to correct this problem, see Failure of the ucmmd Daemon to Start.
Degraded - reconfiguration in progress
Description: The UCMM is undergoing a reconfiguration. This message indicates a problem only if the reconfiguration of the UCMM is not completed and the status of this resource persistently remains degraded.

Cause: If this message indicates a problem, the cause of the failure is a configuration error in one or more components of Sun Cluster Support for Oracle RAC.

Solution: The solution to this problem depends on whether the message indicates a problem:
If the message indicates a problem, correct the problem as explained in How to Recover From a Failure of the ucmmd Daemonor a Related Component.
If the message does not indicate a problem, no action is required.
Description: Reconfiguration of Oracle RAC was not completed until after the START method of the SUNW.rac_framework resource timed out.

Solution: For instructions to correct the problem, see How to Recover From the Timing Out of the START Method.
The following status messages are associated with the failure of a SUNW.vucmm_framework resource to start:
Faulted - vucmmd is not running
Description: The vucmmd daemon is not running on the node where the resource resides.

Solution: For information about how to correct this problem, see Failure of the vucmmd Daemon to Start.
Degraded - reconfiguration in progress
Description: The multiple-owner volume-manager framework is undergoing a reconfiguration. This message indicates a problem only if the reconfiguration of the multiple-owner volume-manager framework is not completed and the status of this resource persistently remains degraded.

Cause: If this message indicates a problem, the cause of the failure is a configuration error in one or more components of the volume manager reconfiguration framework.

Solution: The solution to this problem depends on whether the message indicates a problem:
If the message indicates a problem, correct the problem as explained in How to Recover From a Failure of the vucmmd Daemon or a Related Component.
If the message does not indicate a problem, no action is required.
Description: Reconfiguration of the multiple-owner volume-manager framework was not completed until after the START method of the SUNW.vucmm_framework resource timed out.

Solution: For instructions to correct the problem, see How to Recover From the Timing Out of the START Method.
This procedure provides the long forms of the Sun Cluster maintenance commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix A, Sun Cluster Object-Oriented Commands, in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.
Become superuser or assume a role that provides solaris.cluster.admin RBAC authorization.
On the node where the START method timed out, take offline the framework resource group that failed to start.
To perform this operation, switch the primary nodes of the resource group to the other nodes where the group is online.
```
# clresourcegroup offline -n nodelist resource-group
```

-n nodelist: Specifies a comma-separated list of other cluster nodes on which resource-group is online. Omit from this list the node where the START method timed out.

resource-group: Specifies the name of the framework resource group.
If your configuration uses both a multiple-owner volume-manager framework resource group and a RAC framework resource group, first take offline the multiple-owner volume-manager framework resource group. When the multiple-owner volume-manager framework resource group is offline, then take offline the RAC framework resource group.
If the RAC resource group was created by using the clsetup utility, the name of the resource group is rac-framework-rg.
On all cluster nodes that can run Sun Cluster Support for Oracle RAC, bring online the framework resource group that failed to come online.
```
# clresourcegroup online resource-group
```

resource-group: Specifies that the resource group that you brought offline in Step 2 is to be moved to the MANAGED state and brought online.
If a resource fails to stop, correct this problem as explained in Clearing the STOP_FAILED Error Flag on Resources in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.
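As a sketch, assuming the node and resource names from the earlier examples:

```
# clresource clear -f STOP_FAILED -n pclus1 rac-framework-rs
```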