Configuring Cluster Nodes With a Single, Dual-Port HBA

This section explains the use of dual-port host bus adapters (HBAs) to provide both connections to shared storage in the cluster. While Oracle Solaris Cluster supports this configuration, it is less redundant than the recommended configuration. If you choose to use this configuration, you must understand the risks that a dual-port HBA configuration poses to the availability of your application.

Risks and Trade-offs When Using One Dual-Port HBA

You should strive for as much separation and hardware redundancy as possible when connecting each cluster node to shared data storage. This approach provides the following advantages to your cluster:

The best assurance of high availability for your clustered application
Good failure isolation
Good maintenance robustness

Oracle Solaris Cluster is usually layered on top of a volume manager, mirrored data with independent I/O paths, or a multipathed I/O link to a hardware RAID arrangement. Therefore, the cluster software does not expect a node ever to lose access to shared data. These redundant paths to storage ensure that the cluster can survive any single failure.

Oracle Solaris Cluster does support certain configurations that use a single, dual-port HBA to provide the required two paths to the shared data. However, using a single, dual-port HBA for connecting to shared data increases the vulnerability of your cluster. If this single HBA fails and takes down both ports connected to the storage device, the node is unable to reach the stored data. How the cluster handles such a dual-port failure depends on several factors:

The cluster configuration
The volume manager configuration
The node on which the failure occurs
The state of the cluster when the failure occurs

If you choose one of these configurations for your cluster, you must understand that the supported configurations only mitigate the risks to high availability, failure isolation, and maintenance robustness. They do not eliminate those risks.

Supported Configurations When Using a Single, Dual-Port HBA

Oracle Solaris Cluster supports the following volume manager configurations when you use a single, dual-port HBA for connecting to shared data:

Solaris Volume Manager with more than one disk in each disk set and no dual-string mediators configured. For details about this configuration, see Cluster Configuration When Using Solaris Volume Manager and a Single Dual-Port HBA.
Solaris Volume Manager for Oracle Solaris Cluster. For details about this configuration, see Cluster Configuration When Using Solaris Volume Manager for Oracle Solaris Cluster and a Single Dual-Port HBA.

Cluster Configuration When Using Solaris Volume Manager and a Single Dual-Port HBA

If the Solaris Volume Manager metadbs lose replica quorum for a disk set on a cluster node, the volume manager panics the cluster node. Oracle Solaris Cluster then takes over the disk set on a surviving node and your application fails over to a secondary node.

To ensure that the node panics and is fenced off if it loses its connection to shared storage, configure each metaset with at least two disks. In this configuration, the metadbs stored on the disks create their own replica quorum for each disk set.

Dual-string mediators are not supported in Solaris Volume Manager configurations that use a single dual-port HBA. Using dual-string mediators prevents the service from failing over to a new node.

Configuration Requirements

When configuring Solaris Volume Manager metasets, ensure that each metaset contains at least two disks. Do not configure dual-string mediators.
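
The two requirements above can be expressed as a simple check. The following Python sketch is illustrative only: the DiskSet class and the check_disk_set function are hypothetical names, not part of Solaris Volume Manager or Oracle Solaris Cluster, and the disk and host names are made-up sample values. The sketch works on a hand-written description of the disk sets and does not query a real system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DiskSet:
    """Hypothetical, hand-written description of one disk set (metaset)."""
    name: str
    disks: List[str] = field(default_factory=list)
    mediator_hosts: List[str] = field(default_factory=list)


def check_disk_set(ds: DiskSet) -> List[str]:
    """Return the requirements from this section that the disk set violates."""
    problems = []
    # Requirement 1: each metaset contains at least two disks.
    if len(ds.disks) < 2:
        problems.append(
            f"{ds.name}: needs at least two disks, has {len(ds.disks)}"
        )
    # Requirement 2: no dual-string mediators are configured.
    if ds.mediator_hosts:
        problems.append(
            f"{ds.name}: dual-string mediators must not be configured"
        )
    return problems


if __name__ == "__main__":
    # Sample values only; they do not come from a real cluster.
    sample_sets = [
        DiskSet("appset", disks=["d3", "d4"]),
        DiskSet("logset", disks=["d5"], mediator_hosts=["node1", "node2"]),
    ]
    for ds in sample_sets:
        for line in check_disk_set(ds) or [f"{ds.name}: OK"]:
            print(line)
```

An empty result means that the described disk set meets both requirements. It says nothing about the other risks of a single dual-port HBA that are discussed earlier in this section.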

Expected Failure Behavior with Solaris Volume Manager

When a dual-port HBA fails and takes down both ports in this configuration, the cluster behavior depends on whether the affected node is primary for the disk set.

If the affected node is primary for the disk set, Solaris Volume Manager panics that node because it lacks the required state database replicas. Your cluster re-forms with the nodes that achieve quorum and brings the disk set online on a new primary node.
If the affected node is not primary for the disk set, your cluster remains in a degraded state.

Failure Recovery with Solaris Volume Manager

Follow the instructions for replacing an HBA in your storage device documentation.

Cluster Configuration When Using Solaris Volume Manager for Oracle Solaris Cluster and a Single Dual-Port HBA

Because Solaris Volume Manager for Oracle Solaris Cluster uses raw disks only and is specific to Oracle Real Application Clusters (RAC), no special configuration is required.

Expected Failure Behavior with Solaris Volume Manager for Oracle Solaris Cluster

When a dual-port HBA fails and takes down both ports in this configuration, the cluster behavior depends on whether the affected node is the current master for the multi-owner disk set.

If the affected node is the current master for the multi-owner disk set, the node does not panic. However, if any other node subsequently fails or is rebooted, the affected node panics when it tries to update the replicas. The volume manager then chooses a new master for the disk set if the surviving nodes can achieve quorum.
If the affected node is not the current master for the multi-owner disk set, the node remains up but the device group is in a degraded state. If an additional failure affects the master node and Solaris Volume Manager for Oracle Solaris Cluster attempts to remaster the disk set on the node with the failed paths, that node will also panic. A new master will be chosen if any surviving nodes can achieve quorum.
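
The two cases above can be summarized as a small decision function. This Python sketch only restates the behavior described in this section. The names FollowUp and affected_node_panics are hypothetical, and the model deliberately leaves out cluster quorum and the selection of a new master.

```python
from enum import Enum, auto


class FollowUp(Enum):
    """Hypothetical labels for what happens after the HBA failure."""
    NONE = auto()                          # no further event
    OTHER_NODE_FAILS_OR_REBOOTS = auto()   # any other node fails or reboots
    REMASTER_ONTO_AFFECTED_NODE = auto()   # master fails, remaster targets
                                           # the node with the failed paths


def affected_node_panics(is_master: bool, follow_up: FollowUp) -> bool:
    """Return True if the node that lost both storage paths panics.

    Simplified reading of the text: the HBA failure alone never panics
    the node; a later event does.
    """
    if is_master:
        # The master panics when it next tries to update the replicas,
        # which happens when another node fails or is rebooted.
        return follow_up is FollowUp.OTHER_NODE_FAILS_OR_REBOOTS
    # A non-master stays up (device group degraded) unless the disk set
    # is remastered onto it.
    return follow_up is FollowUp.REMASTER_ONTO_AFFECTED_NODE


if __name__ == "__main__":
    for is_master in (True, False):
        for event in FollowUp:
            print(is_master, event.name, affected_node_panics(is_master, event))
```

In both cases the node survives the HBA failure itself and panics only on a second event, which is why this configuration reduces, but does not remove, the exposure to a double failure.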

Failure Recovery with Solaris Volume Manager for Oracle Solaris Cluster

Follow the instructions for replacing an HBA in your storage device documentation.