Sun Cluster Concepts Guide for Solaris OS

Disk Device Groups

In the Sun Cluster system, all multihost devices must be under control of the Sun Cluster software. You first create volume manager disk groups on the multihost disks. These disk groups are either Solaris Volume Manager disk sets or VERITAS Volume Manager disk groups (available only in SPARC based clusters). Then, you register the volume manager disk groups as disk device groups. A disk device group is a type of global device. In addition, the Sun Cluster software automatically creates a raw disk device group for each disk and tape device in the cluster. However, these cluster device groups remain in an offline state until you access them as global devices.

Registration provides the Sun Cluster system with information about which nodes have a path to specific volume manager disk groups. At this point, the volume manager disk groups become globally accessible within the cluster. If more than one node can write to (master) a disk device group, the data stored in that disk device group becomes highly available. The highly available disk device group can be used to contain cluster file systems.

Note –

Disk device groups are independent of resource groups. One node can master a resource group (representing a group of data service processes) while another node can master the disk groups that are being accessed by the data services. However, the best practice is to keep the disk device group that stores a particular application's data on the same node as the resource group that contains the application's resources (the application daemon). Refer to Relationship Between Resource Groups and Disk Device Groups in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for more information about the association between disk device groups and resource groups.

When a node uses a disk device group, the volume manager disk group becomes “global” because it provides multipath support to the underlying disks. Each cluster node that is physically attached to the multihost disks provides a path to the disk device group.

Disk Device Group Failover

Because a disk enclosure is connected to more than one node, all disk device groups in that enclosure are accessible through an alternate path if the node currently mastering the device group fails. The failure of the node mastering the device group does not affect access to the device group except for the time it takes to perform the recovery and consistency checks. During this time, all requests are blocked (transparently to the application) until the system makes the device group available.

Figure 3–1 Disk Device Group Before and After Failover


Multiported Disk Device Groups

This section describes disk device group properties that enable you to balance performance and availability in a multiported disk configuration. Sun Cluster software provides two properties used to configure a multiported disk configuration: preferenced and numsecondaries. You can control the order in which nodes attempt to assume control if a failover occurs by using the preferenced property. Use the numsecondaries property to set a desired number of secondary nodes for a device group.

A highly available service is considered down when the primary fails and no eligible secondary nodes can be promoted to primary. If service failover occurs and the preferenced property is true, then the nodes follow the order in the nodelist to select a secondary. The nodelist defines the order in which nodes attempt to assume primary control or transition from spare to secondary. You can dynamically change the preference of a device service by using the scsetup(1M) utility. The preference that is associated with dependent service providers, for example a global file system, is identical to the preference of the device service.

Secondary nodes are checkpointed by the primary node during normal operation. In a multiported disk configuration, checkpointing each secondary node degrades cluster performance and adds memory overhead. Spare node support was implemented to minimize this performance degradation and memory overhead. By default, your disk device group has one primary and one secondary. The remaining available provider nodes become spares. If failover occurs, the secondary becomes primary and the node highest in priority on the nodelist becomes secondary.
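
The role changes described above can be illustrated with a small model. This is only a sketch of the documented behavior, not Sun Cluster code: the DeviceGroup class, the fail_over function, and the node names are hypothetical, and the model assumes that the nodelist is ordered from highest to lowest priority and that the failed primary takes no role after failover.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DeviceGroup:
    """Hypothetical model of a disk device group's node roles."""
    nodelist: List[str]      # provider nodes, assumed highest priority first
    primary: str
    secondaries: List[str]
    spares: List[str] = field(default_factory=list)


def fail_over(group: DeviceGroup) -> DeviceGroup:
    """Return the roles after the primary fails.

    The first secondary becomes primary, and the spare that appears
    earliest in the nodelist is promoted to secondary.
    """
    if not group.secondaries:
        # No eligible secondary can be promoted: the service is down.
        raise RuntimeError("no eligible secondary; service is down")

    new_primary = group.secondaries[0]
    new_secondaries = list(group.secondaries[1:])

    # Order the spares by their position in the nodelist.
    spares = sorted(group.spares, key=group.nodelist.index)
    if spares:
        new_secondaries.append(spares.pop(0))

    return DeviceGroup(
        nodelist=list(group.nodelist),
        primary=new_primary,
        secondaries=new_secondaries,
        spares=spares,
    )


before = DeviceGroup(
    nodelist=["phys-1", "phys-2", "phys-3", "phys-4"],
    primary="phys-1",
    secondaries=["phys-2"],
    spares=["phys-4", "phys-3"],
)
after = fail_over(before)
```

In this example, phys-2 takes over as primary, phys-3 is promoted from spare to secondary because it precedes phys-4 in the nodelist, and phys-4 remains a spare. Only the secondary is checkpointed, which is the saving that spare nodes are meant to provide.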

The desired number of secondary nodes can be set to any integer between one and the number of operational nonprimary provider nodes in the device group.

Note –

If you are using Solaris Volume Manager, you must create the disk device group before you can set the numsecondaries property to a number other than the default.

The default desired number of secondaries for device services is one. The actual number of secondary providers that is maintained by the replica framework is the desired number, unless the number of operational nonprimary providers is less than the desired number. You must alter the numsecondaries property and double-check the nodelist if you are adding or removing nodes from your configuration. Maintaining the nodelist and desired number of secondaries prevents conflict between the configured number of secondaries and the actual number allowed by the framework.
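
The relationship between the desired and the actual number of secondaries can be sketched as follows. The function names are hypothetical and are not part of any Sun Cluster interface. The sketch assumes only the two rules stated above: the desired number must lie between one and the number of operational nonprimary provider nodes, and the framework maintains the desired number unless fewer nonprimary providers are operational.

```python
def validate_numsecondaries(desired: int, operational_nonprimary: int) -> None:
    """Reject a desired value outside the documented range (hypothetical helper)."""
    if not 1 <= desired <= operational_nonprimary:
        raise ValueError(
            f"numsecondaries must be between 1 and {operational_nonprimary}, "
            f"got {desired}"
        )


def actual_secondaries(desired: int, operational_nonprimary: int) -> int:
    """Number of secondaries maintained: the desired number, capped by
    the number of operational nonprimary providers."""
    return min(desired, operational_nonprimary)


# Four provider nodes, one primary, three operational nonprimary nodes.
validate_numsecondaries(2, 3)
maintained = actual_secondaries(2, 3)

# Two nonprimary nodes later fail, which leaves one operational.
degraded = actual_secondaries(2, 1)
```

With three operational nonprimary providers, the desired two secondaries are maintained. After two of those providers fail, only one secondary can be maintained even though numsecondaries is still set to two, which is why the property should be reviewed whenever nodes are added or removed.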