Failure of a Multiple-Owner Volume-Manager Framework Resource Group

This section describes problems that can affect the multiple-owner volume-manager framework resource group.

Node Panic During Initialization of the Multiple-Owner Volume-Manager Framework

If a fatal problem occurs during the initialization of the multiple-owner volume-manager framework, the node panics with an error message similar to the following:

Note - When the node is a global-cluster node, the node panic brings down the entire machine.

Failure of the `vucmmd` Daemon to Start

The multiple-owner volume-manager framework daemon, vucmmd, manages the reconfiguration of the multiple-owner volume-manager framework. When a cluster is booted or rebooted, this daemon is started only after all components of the multiple-owner volume-manager framework are validated. If the validation of a component on a node fails, the vucmmd daemon fails to start on the node.

The most common causes of this problem are as follows:

An error occurred during a previous reconfiguration of a component of the multiple-owner volume-manager framework.
A step in a previous reconfiguration of the multiple-owner volume-manager framework timed out, causing the node on which the timeout occurred to panic.

For instructions to correct the problem, see How to Recover From a Failure of the vucmmd Daemon or a Related Component.
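Before you begin recovery, it can help to confirm whether the daemon is actually running on the node. The following is a minimal sketch, not part of the documented procedure. The function name `daemon_running` is hypothetical, and the sketch assumes that the `pgrep` utility is available and that the process name is exactly `vucmmd`.

```shell
#!/bin/sh
# Report whether a process with the exact name given is running.
# Defaults to vucmmd. pgrep -x requires an exact match of the process name.
daemon_running() {
    name=${1:-vucmmd}
    if pgrep -x "$name" >/dev/null 2>&1; then
        echo "$name is running"
        return 0
    fi
    echo "$name is not running"
    return 1
}
```

Calling `daemon_running` with no argument checks for `vucmmd`. A nonzero return status means that no matching process was found on the node where the function was run.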

How to Recover From a Failure of the `vucmmd` Daemon or a Related Component

Perform this task to correct the problems that are described in the following sections:

Node Panic During Initialization of the Multiple-Owner Volume-Manager Framework
Failure of the vucmmd Daemon to Start

To determine the cause of the problem, examine the log files for multiple-owner volume-manager framework reconfigurations and the system messages file.
For the location of the log files for multiple-owner volume-manager framework reconfigurations, see Sources of Diagnostic Information.

When you examine these files, start at the most recent message and work backward until you identify the cause of the problem.

For more information about error messages that might indicate the cause of reconfiguration errors, see Oracle Solaris Cluster Error Messages Guide.
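Reading a log from the most recent message backward can be sketched as follows. This is an illustration only. The function name `newest_first` is hypothetical, the search pattern is an assumption, and `/var/adm/messages` is assumed to be the system messages file, as is typical on Oracle Solaris. The location of the reconfiguration log files is not assumed here.

```shell
#!/bin/sh
# Print the lines of a log file newest first, keeping only lines that
# match an optional extended regular expression.
#   $1 = log file
#   $2 = pattern (default "." matches every nonempty line)
# Note: the -v option needs a POSIX awk. On some systems that may be
# nawk or /usr/xpg4/bin/awk rather than the default awk.
newest_first() {
    awk -v pat="${2:-.}" '
        $0 ~ pat { lines[++n] = $0 }
        END { for (i = n; i >= 1; i--) print lines[i] }
    ' "$1"
}
```

For example, `newest_first /var/adm/messages vucmm | more` would page through matching messages, starting with the latest one.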
Correct the problem that caused the component to return an error to the multiple-owner volume-manager framework.

If the solution to the problem requires a reboot, reboot the node where the problem occurred.
Only certain solutions require a reboot. For example, increasing the amount of shared memory requires a reboot. However, increasing the value of a step timeout does not require a reboot.

For more information about how to reboot a node, see Shutting Down and Booting a Single Node in a Cluster in Oracle Solaris Cluster System Administration Guide.
On the node where the problem occurred, take the multiple-owner volume-manager framework resource group offline and then bring it online again.
This step refreshes the resource group with the configuration changes you made.
1. Assume the root role or assume a role that provides solaris.cluster.admin RBAC authorization.
2. Type the command to take offline the multiple-owner volume-manager framework resource group and its resources.
```
# clresourcegroup offline -n node vucmm-fmwk-rg
```
  -n node
  
  Specifies the node name or node identifier (ID) of the node where the problem occurred.
  
  vucmm-fmwk-rg
  
  Specifies the name of the resource group that is to be taken offline.
3. Type the command to bring online and in a managed state the multiple-owner volume-manager framework resource group and its resources.
```
# clresourcegroup online -eM -n node vucmm-fmwk-rg
```
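The offline and online commands in this procedure can be combined into a small wrapper, sketched below. The function names `run` and `refresh_rg` and the `DRY_RUN` and `RG` variables are hypothetical. The final `clresourcegroup status` call is an added check that is not part of the documented steps. The sketch assumes that the resource group is named `vucmm-fmwk-rg`, as in the examples, and that it is run by a role with the required authorization.

```shell
#!/bin/sh
# Name of the framework resource group, as used in the examples.
RG=${RG:-vucmm-fmwk-rg}

# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Take the resource group offline on one node, bring it back online in a
# managed state, and then display its status. Stops at the first failure.
refresh_rg() {
    node=${1:-}
    if [ -z "$node" ]; then
        echo "usage: refresh_rg node" >&2
        return 2
    fi
    run clresourcegroup offline -n "$node" "$RG" || return 1
    run clresourcegroup online -eM -n "$node" "$RG" || return 1
    run clresourcegroup status -n "$node" "$RG"
}
```

Running `DRY_RUN=1 refresh_rg node` first prints the three commands without executing them, which allows the node name to be checked before any change is made.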